-
- Python script that scrapes a single TechCrunch page or article and writes the result to a MongoDB database hosted on mLab.
-
- Finds the URLs of all the latest posts and passes each one to singletechcrunchpaper.py.
-
- Script for the crontab job; runs exactly once every day.
- SparkMongoConnector.scala
- Scala singleton that connects to MongoDB and performs basic operations on the data.
-
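The "find the latest post URLs and pass each one on" step can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the names `PostLinkParser`, `find_post_urls`, and `scrape_all`, the `post-link` CSS class, and the `scrape_article` callback (standing in for whatever singletechcrunchpaper.py exposes) are all assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PostLinkParser(HTMLParser):
    """Collects href values from <a> tags that carry a given CSS class."""

    def __init__(self, base_url, link_class):
        super().__init__()
        self.base_url = base_url
        self.link_class = link_class  # hypothetical class name, see below
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        href = attr_map.get("href")
        if href and self.link_class in classes:
            url = urljoin(self.base_url, href)  # resolve relative links
            if url not in self.urls:  # drop duplicates, keep page order
                self.urls.append(url)


def find_post_urls(listing_html, base_url="https://techcrunch.com/",
                   link_class="post-link"):
    """Return the post URLs found in the HTML of a listing page.

    "post-link" is a placeholder; the real markup must be inspected.
    """
    parser = PostLinkParser(base_url, link_class)
    parser.feed(listing_html)
    return parser.urls


def scrape_all(listing_html, scrape_article):
    """Call scrape_article(url) for every post URL and collect the results."""
    return [scrape_article(url) for url in find_post_urls(listing_html)]
```

Passing the per-article scraper in as a callback keeps the link discovery testable without network access or a database.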
Python libs
-
DB Used
-
DB Connector
-
Data Processing
-
OS scheduling
- application.conf file containing the MongoDB username, password, and connection link.
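As an illustration of how a username, password, and link might be combined into a connection string, here is a small Python sketch. The helper name `build_mongo_uri` and the sample values are hypothetical, and the actual key names in application.conf are not shown in this listing. Credentials are percent-escaped because characters such as `@` or `/` would otherwise break the URI.

```python
from urllib.parse import quote_plus


def build_mongo_uri(username, password, host_and_db):
    """Return a mongodb:// URI with percent-escaped credentials.

    host_and_db is the part after the '@', e.g. "host:port/database".
    """
    return "mongodb://{}:{}@{}".format(
        quote_plus(username), quote_plus(password), host_and_db
    )
```

For example, `build_mongo_uri("user", "p@ss/word", "host.example.com:27017/news")` escapes the `@` and `/` in the password before embedding it.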