How I re-created a 2500+ star open source project with 52 lines of code

Lesson 101: Don’t get frightened & intimidated by GitHub stars

“Requirements files” are files containing a list of items (Python packages) to be installed using pip install, like so: pip install -r requirements.txt. Logically, a Requirements file is just a list of pip install arguments placed in a file.

pip freeze | grep pkg_name
pip freeze | findstr pkg_name
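The two one-liners above do the same job, grep on Unix-like shells and findstr on Windows. The same filter can be sketched in Python, which sidesteps the platform difference. This is a rough sketch rather than the project's code: the helper names filter_freeze and pip_freeze are made up for illustration, the match is a plain substring test (not a regex like grep), and the sample text reuses package versions that show up later in this post.

```python
import subprocess
import sys


def filter_freeze(freeze_output, pkg_name):
    """Return the lines of `pip freeze` output that contain pkg_name."""
    return [line for line in freeze_output.splitlines() if pkg_name in line]


def pip_freeze():
    """Run `pip freeze` with the current interpreter and return its text output."""
    completed = subprocess.run(
        [sys.executable, "-m", "pip", "freeze"],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


# Sample output, so the sketch runs without touching the real environment.
sample = "construct==2.5.2\nrequests==2.14.2\n"
matches = filter_freeze(sample, "requests")
print(matches)
```

To run it against the real environment, pass pip_freeze() instead of the sample string.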

Why freaking not? It’s just a shell script away!

My search wagon getting wrecked
Me

HOLY SHIT, LOOK AT THOSE STARS

WHAT THE … 45 ISSUES! ... SO MANNNYYY PULLS

I’VE NEVER BUILT SOMETHING THAT BIG!

After pondering a little more
#bndr/pipreqs/issues/163 & #bndr/pipreqs/issues/159
Okay, what to do?
$ python -c "import requests; print(requests.__version__)"
2.14.2
$ python -c "import lxml; print(lxml.__version__)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: 'module' object has no attribute '__version__'
>>> import pkg_resources
>>> pkg_resources.get_distribution("construct").version
'2.5.2'
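The session above shows why __version__ cannot be relied on: some modules simply do not define it, while the installed distribution's metadata still knows the version. On Python 3.8 and later, the standard library's importlib.metadata answers the same question as pkg_resources.get_distribution. Here is a minimal sketch, assuming a helper called installed_version (a hypothetical name, not something from the project):

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# A name that should not be installed comes back as None instead of raising.
missing = installed_version("surely-not-a-real-package-0x48")
print(missing)
```

Returning None for a missing package keeps the caller's loop simple: it can skip that entry instead of wrapping every lookup in try/except.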

Project Technicals

What is required are functions that are ‘functional’ in the mathematical sense, not programming in the ‘using functions’ sense. A mathematical function, or ‘pure function’, operates on the supplied arguments and returns a result and does nothing else. No ‘side effects’. Nothing changed by the function, no internal variables altered that will result in a future call of the same function dealing with different values.
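To make the distinction concrete, here is a small sketch contrasting the two styles. Both functions and the list contents are made up for illustration and are not taken from the project.

```python
seen = []  # module-level state that the impure version mutates


def add_package_impure(name):
    """Impure: appends to a global list, so the result depends on earlier calls."""
    seen.append(name)
    return seen


def add_package_pure(packages, name):
    """Pure: builds a new list from its arguments and touches nothing else."""
    return packages + [name]


base = ["requests"]
result = add_package_pure(base, "lxml")
print(result)  # ['requests', 'lxml']
print(base)    # ['requests'], the input list is unchanged
```

Calling add_package_pure twice with the same arguments always gives the same answer, while add_package_impure returns a longer list every time it is called.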

Naming the repository

Final word

We used reaper to measure the dimensions of 1,857,423 GitHub repositories. We then used manually classified data sets of repositories to train classifiers capable of predicting if a given GitHub repository contains an engineered software project. […] The performance of the classifiers was evaluated using a set of 200 repositories with known ground truth classification. We also compared the performance of the classifiers to other approaches to classification (e.g. number of GitHub Stargazers) and found our classifiers to outperform existing approaches. We found stargazers-based classifier (with 10 as the threshold for number of stargazers) to exhibit high precision (97%) but an inversely proportional recall (32%). On the other hand, our best classifier exhibited a high precision (82%) and a high recall (86%).

The stargazer-based criteria offers precision but fails to recall a significant portion of the population.

Don’t get scared of GitHub stars. Build crazy things which help solve big problems.

Little about me

Google Code-In C. Winner. GSoCer ’19. Independent Security Researcher. Have hacked Medium, Mozilla, Opera & many more. Personal Website: https://0x48piraj.com

Piyush Raj ~ Rex
