Initial commit
MajidAli44 committed Jul 3, 2021
0 parents commit da0ea52
Showing 15 changed files with 250 additions and 0 deletions.
3 changes: 3 additions & 0 deletions .idea/.gitignore

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 8 additions & 0 deletions .idea/WEbScraping.iml


6 changes: 6 additions & 0 deletions .idea/inspectionProfiles/profiles_settings.xml


4 changes: 4 additions & 0 deletions .idea/misc.xml


8 changes: 8 additions & 0 deletions .idea/modules.xml


53 changes: 53 additions & 0 deletions Index.html
@@ -0,0 +1,53 @@
<!doctype html>
<html lang="en">
<head>
<!-- Required meta tags -->
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">

<!-- Bootstrap CSS -->
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.1/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-+0n0xVW2eSR5OomGNYDnhzAbDsOXxcvSN1TPprVMTNDbiYZCxYbOOl7+AMvyTG2x" crossorigin="anonymous">

<title>My Course</title>
</head>
<body>
<h1>Hello, Start learning!!</h1>
<div class="card" id="card-python-for-beginners">
<div class="card-header">
Python
</div>
<div class="card-body">
<h5 class="card-title">Python for beginners</h5>
<p class="card-text">If you are new to Python, this is the course that you should buy!</p>
<a href="#" class="btn btn-primary">Start for 20$</a>
</div>
</div>

<div class="card" id="card-python-web-development">
<div class="card-header">
Python
</div>
<div class="card-body">
<h5 class="card-title">Python Web Development</h5>
<p class="card-text">If you feel confident enough with Python, you are ready to learn how to create your own website.</p>
<a href="#" class="btn btn-primary">Start for 50$</a>
</div>
</div>

<div class="card" id="card-python-machine-learning">
<div class="card-header">
Python
</div>
<div class="card-body">
<h5 class="card-title">Python Machine Learning</h5>
<p class="card-text">Become a Python machine learning master.</p>
<a href="#" class="btn btn-primary">Start for 100$</a>
</div>
</div>

<script src="https://cdn.jsdelivr.net/npm/@popperjs/core@2.9.2/dist/umd/popper.min.js" integrity="sha384-IQsoLXl5PILFhosVNubq5LC7Qb9DXgDA9i+tQ8Zj3iwWAwPtgFTxbJ8NT4GN1R8p" crossorigin="anonymous"></script>
<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.0.1/dist/js/bootstrap.min.js" integrity="sha384-Atwg2Pkwv9vp0ygtn1JAojH0nYbwNJLPhwyoVbhoPwBhjQPR5VtM2+xf0Uwh9KtT" crossorigin="anonymous"></script>
</body>


</html>
36 changes: 36 additions & 0 deletions Scraping.py
@@ -0,0 +1,36 @@
from bs4 import BeautifulSoup

with open("Index.html", 'r') as html_file:
    content = html_file.read()
# print(content)

soup = BeautifulSoup(content, 'lxml')
# print(soup.prettify())


# tags=soup.find('h5')
# tags = soup.find_all('h5')
# print(tags)

# courses_html_tags=soup.find_all('h5')
# for courses in courses_html_tags:
#     print(courses)
#     print(courses.text)

course_card = soup.find_all("div", class_="card")
for course in course_card:
    # print(course)
    # print(course.h5)
    course_name = course.h5.text
    # course_price=course.a.text
    course_price = course.a.text.split()[-1]  # last token of the button label, e.g. "20$"

    # print(course_name)
    # print(course_price)

    print(f'{course_name} costs {course_price}')
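
The loop above keeps only the last whitespace-separated token of each button label, so `course_price` is a string such as `20$`. A minimal sketch of turning such a label into a number, assuming the amount is always the last run of digits in the label; the helper name `parse_price` is hypothetical and not part of the commit:

```python
import re


def parse_price(label: str) -> int:
    """Return the integer amount from a label such as 'Start for 20$'.

    Assumption: the amount is the last run of digits in the label.
    """
    amounts = re.findall(r"\d+", label)
    if not amounts:
        raise ValueError(f"no price found in {label!r}")
    return int(amounts[-1])


# The three button labels that appear in Index.html.
labels = ["Start for 20$", "Start for 50$", "Start for 100$"]
prices = [parse_price(label) for label in labels]
print(prices)
```

With numeric prices, the courses could be sorted or summed, which the string form does not allow directly.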





114 changes: 114 additions & 0 deletions ScrapingRealWebsite.py
@@ -0,0 +1,114 @@
from bs4 import BeautifulSoup
import requests
import time

# html_text=requests.get("https://www.timesjobs.com/candidate/job-search.html?searchType=personalizedSearch&from=submit&txtKeywords=python&txtLocation=").text

# print(html_text)

# soup=BeautifulSoup(html_text,"lxml")

# jobs=soup.find('li', class_ ="clearfix job-bx wht-shd-bx")
# print(jobs)

# jobs=soup.find_all('li', class_ ="clearfix job-bx wht-shd-bx")
# print(jobs)


# For Single


# job=soup.find('li', class_ ="clearfix job-bx wht-shd-bx")
# company_name=job.find("h3", class_= "joblist-comp-name")
# skills=job.find("span", class_="srp-skills")
# category=job.find("strong", class_="blkclor").text.replace(' ','')
# published_date=job.find("span",class_="sim-posted").text
# print(published_date)

# print(f'''
# Job Type: {category}
# Company Name: {company_name.text.replace(' ','')}
# Required Skills: {skills.text.replace(' ','')}
# ''')

# print(company_name.text.replace(' ',''))
# print(skills.text.replace(' ',''))



# For Multiple

# jobs=soup.find_all('li', class_ ="clearfix job-bx wht-shd-bx")
# for job in jobs:
#     company_name = job.find("h3", class_="joblist-comp-name")
#     skills = job.find("span", class_="srp-skills")
#     category = job.find("strong", class_="blkclor").text.replace(' ', '')
#     published_date = job.find("span", class_="sim-posted").text
#
#     print(f'''
#     Job Type: {category}
#     Company Name: {company_name.text.replace(' ', '')}
#     Required Skills: {skills.text.replace(' ', '')}
#     Published Date: {published_date}
#     ''')
#
#     print("--------------------------------------------------------------------")




# jobs=soup.find_all('li', class_ ="clearfix job-bx wht-shd-bx")
# for job in jobs:
#     published_date = job.find("span", class_="sim-posted").text
#     if 'few' in published_date:
#         company_name = job.find("h3", class_="joblist-comp-name")
#         skills = job.find("span", class_="srp-skills")
#         category = job.find("strong", class_="blkclor").text.replace(' ', '')
#
#
#         print(f'''
#         Job Type: {category}
#         Company Name: {company_name.text.replace(' ', '')}
#         Required Skills: {skills.text.replace(' ', '')}
#         Published Date: {published_date}
#         ''')
#
#         print("--------------------------------------------------------------------")


# Adding features to the project
print("Enter a skill that you are not familiar with!")
unfamiliar_skill = input("> ")
print(f"Filtering Out: {unfamiliar_skill}")

def find_jobs():
    # Fetch the search results page for the keyword "python"
    html_text = requests.get("https://www.timesjobs.com/candidate/job-search.html?searchType=personalizedSearch&from=submit&txtKeywords=python&txtLocation=").text

    soup = BeautifulSoup(html_text, "lxml")
    jobs = soup.find_all('li', class_="clearfix job-bx wht-shd-bx")
    for index, job in enumerate(jobs):
        published_date = job.find("span", class_="sim-posted").text
        # Keep only posts whose date text contains 'few'
        if 'few' in published_date:
            company_name = job.find("h3", class_="joblist-comp-name").text.replace(' ', '')
            skills = job.find("span", class_="srp-skills").text.replace(' ', '')
            more_info = job.header.h2.a["href"]

            # Skip jobs that list the unfamiliar skill
            if unfamiliar_skill not in skills:
                # Note: the posts/ directory must already exist
                with open(f'posts/{index}.txt', 'w') as f:
                    f.write(f"Company Name: {company_name.strip()}\n")
                    f.write(f"Required Skills: {skills.strip()}\n")
                    f.write(f"More Info: {more_info}")
                print(f"File {index} saved successfully")


if __name__ == '__main__':
    while True:
        find_jobs()
        time_wait = 10
        print(f'Waiting {time_wait} minutes')
        time.sleep(time_wait * 60)
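
`find_jobs` cleans the scraped text with `.replace(' ', '')`, which removes every space, including the ones between words; that is why the saved posts contain values such as `SATRAInfrastructureManagement`. It also tests `unfamiliar_skill` as a case-sensitive substring of the whole skills string. A hedged sketch of an alternative that uses only the standard library follows. The helper names `normalize_whitespace`, `parse_skills`, and `has_skill` are hypothetical, and the sample strings are made up to imitate padded text; they are not captured from the site.

```python
from typing import List


def normalize_whitespace(text: str) -> str:
    """Collapse runs of whitespace to single spaces and trim both ends."""
    return " ".join(text.split())


def parse_skills(raw: str) -> List[str]:
    """Split a comma-separated skills string into cleaned, lowercase names."""
    return [normalize_whitespace(s).lower() for s in raw.split(",") if s.strip()]


def has_skill(skills: List[str], skill: str) -> bool:
    """Case-insensitive exact match against the parsed skill names."""
    return normalize_whitespace(skill).lower() in skills


# Made-up samples imitating padded text (assumption, not real scraped data).
raw_company = "\n   SATRA Infrastructure Management   \n"
raw_skills = "\n  algorithms  ,  python  ,  Debugging \n"

company = normalize_whitespace(raw_company)
skills = parse_skills(raw_skills)
print(company)
print(skills)
print(has_skill(skills, "Python"))
```

Matching whole skill names rather than substrings also avoids filtering out, for example, `javascript` postings when the user types `java`; whether that is the desired behavior is a design choice the commit does not state.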



Empty file added ScrapingRevision.py
Empty file.
3 changes: 3 additions & 0 deletions posts/12.txt
@@ -0,0 +1,3 @@
Company Name: SATRAInfrastructureManagement
Required Skills: algorithms,python,debugging
More Info: https://www.timesjobs.com/job-detail/python-developer-satra-infrastructure-management-hyderabad-secunderabad-5-to-8-yrs-jobid-NN4FiC2z3rhzpSvf__PLUS__uAgZw==&source=srp
3 changes: 3 additions & 0 deletions posts/2.txt
@@ -0,0 +1,3 @@
Company Name: GEMINISOFTWARESOLUTIONS
Required Skills: python,mobile,svn,nosql,pythonscripting,git,sqldatabase
More Info: https://www.timesjobs.com/job-detail/qa-python-python-sdet-gemini-software-solutions-gurgaon-4-to-7-yrs-jobid-jsOuZLK8chlzpSvf__PLUS__uAgZw==&source=srp
3 changes: 3 additions & 0 deletions posts/21.txt
@@ -0,0 +1,3 @@
Company Name: TandAHRSolutions
Required Skills: Djangoframework,PythonDeveloper,corepython
More Info: https://www.timesjobs.com/job-detail/python-developer-tanda-hr-solutions-mohali-3-to-5-yrs-jobid-GTT0grHZP1tzpSvf__PLUS__uAgZw==&source=srp
3 changes: 3 additions & 0 deletions posts/22.txt
@@ -0,0 +1,3 @@
Company Name: eastindiasecuritiesltd.
Required Skills: python,hadoop,machinelearning
More Info: https://www.timesjobs.com/job-detail/python-engineer-east-india-securities-ltd-kolkata-2-to-5-yrs-jobid-KEkE19WqPbFzpSvf__PLUS__uAgZw==&source=srp
3 changes: 3 additions & 0 deletions posts/23.txt
@@ -0,0 +1,3 @@
Company Name: YMGlobalTechnologiesPteLtd
Required Skills: python,apache,machinelearning
More Info: https://www.timesjobs.com/job-detail/python-developer-ym-global-technologies-pte-ltd-singapore-4-to-7-yrs-jobid-bRMEk1dtdLBzpSvf__PLUS__uAgZw==&source=srp
3 changes: 3 additions & 0 deletions posts/3.txt
@@ -0,0 +1,3 @@
Company Name: GeminiSolutions
Required Skills: python,mobile,svn,nosql,pythonscripting,git,api,sqldatabase
More Info: https://www.timesjobs.com/job-detail/qa-python-python-sdet-gemini-solutions-gurgaon-4-to-7-yrs-jobid-eGMLzwOk2QlzpSvf__PLUS__uAgZw==&source=srp
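
Each saved post above has the same three-line `Key: value` layout written by `find_jobs`. A minimal sketch for reading one back into a dictionary, assuming that layout holds for every file; `parse_post` is a hypothetical helper and not part of the commit:

```python
def parse_post(text: str) -> dict:
    """Parse 'Key: value' lines into a dict, splitting on the first ': ' only."""
    post = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # skip lines that do not match the layout
            post[key.strip()] = value.strip()
    return post


# Contents of posts/3.txt as shown in this commit.
sample = (
    "Company Name: GeminiSolutions\n"
    "Required Skills: python,mobile,svn,nosql,pythonscripting,git,api,sqldatabase\n"
    "More Info: https://www.timesjobs.com/job-detail/qa-python-python-sdet-"
    "gemini-solutions-gurgaon-4-to-7-yrs-jobid-eGMLzwOk2QlzpSvf__PLUS__uAgZw==&source=srp\n"
)

post = parse_post(sample)
print(post["Company Name"])
print(post["Required Skills"].split(","))
```

Splitting on the first `": "` keeps the URL intact, because the colon in `https://` is not followed by a space.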
