How I re-created a 2500+ star open source project with 52 lines of code

Lesson 101: Don’t get frightened & intimidated by GitHub stars

pip freeze | grep pkg_name
pip freeze | findstr pkg_name
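The two one-liners above differ only in the filter tool (grep on Unix-like shells, findstr on Windows). As a rough cross-platform sketch of the same idea, the standard library's importlib.metadata (Python 3.8+) can list installed distributions and filter them by name. The helper name find_installed is hypothetical, and the result will not match pip freeze exactly (pip applies its own filtering and formatting), so treat this as an illustration rather than a drop-in replacement.

```python
from importlib import metadata


def find_installed(fragment):
    """Return sorted 'name==version' lines for installed distributions
    whose name contains `fragment` (case-insensitive).

    Hypothetical helper: it approximates `pip freeze | grep fragment`.
    """
    fragment = fragment.lower()
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        # Skip distributions with missing or broken metadata.
        if name and fragment in name.lower():
            lines.add(f"{name}=={dist.version}")
    return sorted(lines)


if __name__ == "__main__":
    # "requests" is only an example query; nothing is printed if it is not installed.
    for line in find_installed("requests"):
        print(line)
```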

Why freaking not? It’s just a shell script away!

My search wagon getting wrecked
Me
After pondering a little more
#bndr/pipreqs/issues/163 & #bndr/pipreqs/issues/159
Okay, what to do?

Let’s do it!

$ python -c "import requests; print(requests.__version__)"
2.14.2
$ python -c "import lxml; print(lxml.__version__)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
AttributeError: 'module' object has no attribute '__version__'
>>> import pkg_resources
>>> pkg_resources.get_distribution("construct").version
'2.5.2'
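The session above shows the catch: not every module exposes __version__, so a fallback to the installed distribution's metadata is needed. Here is a minimal sketch of that two-step lookup. It is not the project's actual code. The function name get_version and its dist_name parameter are hypothetical, and it uses the standard library's importlib.metadata (Python 3.8+) instead of pkg_resources, which newer setuptools releases mark as deprecated.

```python
import importlib
from importlib import metadata


def get_version(import_name, dist_name=None):
    """Best-effort version lookup for an importable package.

    Step 1: import the module and read its __version__ attribute.
    Step 2: if that fails, ask the installed distribution's metadata.
    Returns None when neither step finds a version.
    `dist_name` covers packages whose PyPI name differs from the import name.
    """
    try:
        module = importlib.import_module(import_name)
        version = getattr(module, "__version__", None)
        if version is not None:
            return str(version)
    except ImportError:
        pass  # Not importable; the metadata lookup may still succeed.

    try:
        return metadata.version(dist_name or import_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Example queries only; each prints None if the package is not installed.
    for name in ("requests", "lxml", "construct"):
        print(name, get_version(name))
```

Note that importing a module runs its top-level code, so trying the metadata lookup first is a reasonable alternative when import side effects are a concern.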

Project Technicals

Naming the repository

… hey, what about a nice logo for the project?

Final word

We used reaper to measure the dimensions of 1,857,423 GitHub repositories. We then used manually classified data sets of repositories to train classifiers capable of predicting if a given GitHub repository contains an engineered software project. […] The performance of the classifiers was evaluated using a set of 200 repositories with known ground truth classification. We also compared the performance of the classifiers to other approaches to classification (e.g. number of GitHub Stargazers) and found our classifiers to outperform existing approaches. We found stargazers-based classifier (with 10 as the threshold for number of stargazers) to exhibit high precision (97%) but an inversely proportional recall (32%). On the other hand, our best classifier exhibited a high precision (82%) and a high recall (86%).

The stargazer-based criteria offers precision but fails to recall a significant portion of the population.

Don’t get scared of GitHub stars. Build crazy things which help solve big problems.

A little about me

Google Code-In C. Winner. GSoCer ’19. Independent Security Researcher. Have hacked Medium, Mozilla, Opera & many more. Personal Website: https://0x48piraj.com