Add basic etl pipeline #61

Merged: 3 commits, Feb 11, 2024
11 changes: 9 additions & 2 deletions pages/index.tsx
@@ -108,8 +108,15 @@ export default function Home() {
</Link>
<p className="text-xs">Regular PDF</p>
<h4 className="text-2xl font-bold mt-12 mb-4">Something to geek out on</h4>
<p className="mb-4 ">
This site and the following files are generated from <a href="https://jsoncrack.com/editor?json=https://raw.githubusercontent.com/yuhonas/clintp.xyz/main/resume/resume.clintp.json" className="underline">resume.clintp.json</a> which conforms to <a href="https://jsonresume.org/" target="_blank" className="underline">JSON Resume</a></p>
<ul className="mb-4 list-disc">
<li>My resume is stored in <a href="https://jsonresume.org/" className="underline">JSON Resume</a> format; it can be viewed at <a href="https://raw.githubusercontent.com/yuhonas/clintp.xyz/main/resume/resume.clintp.json" className="underline">resume.clintp.json</a></li>
<li>The resume is automagically linted/spell checked and built into various formats</li>
<li>You can explore the data using tools like <a href="https://jsoncrack.com/editor?json=https://raw.githubusercontent.com/yuhonas/clintp.xyz/main/resume/resume.clintp.json" className="underline">JSON Crack</a> or
&nbsp;<a href="https://lite.datasette.io/?json=https://raw.githubusercontent.com/yuhonas/clintp.xyz/main/resume/resume.clintp.json#/data/resume?_sort=rowid&_facet=name&_facet=location" className="underline">lite.datasette.io</a>
</li>
<li>As part of the <a href="https://github.com/yuhonas/clintp.xyz/blob/main/.github/workflows/ci.yml" className="underline">resume build</a>, a gist is updated which enables the resume to be published on <a href="https://registry.jsonresume.org/yuhonas" className="underline">registry.jsonresume.org</a>; see also <a href="https://registry.jsonresume.org/yuhonas?theme=stackoverflow" className="underline">here</a> for an example of the resume rethemed</li>
</ul>
<p className="mb-4">The following files are also available</p>
<ul className="mb-4 flex gap-3">
{/* <li>
<Link href="https://jsoncrack.com/editor?json=https://raw.githubusercontent.com/yuhonas/clintp.xyz/main/resume/resume.clintp.json" className="hover:underline">
6 changes: 6 additions & 0 deletions resume/Pipfile
@@ -14,6 +14,12 @@ sumy = "*"
py-readability-metrics = "*"
check-jsonschema = "*"
prefect = "*"
luigi = "*"
pandas = "*"
requests = "*"
qrcode = "*"
segno = "*"
qrcode-artistic = "*"

[requires]
python_version = "3.9"
133 changes: 125 additions & 8 deletions resume/Pipfile.lock

Generated file, not rendered by default.

Empty file added resume/__init__.py
Empty file.
Binary file added resume/build/clintp-qrcode.gif
Binary file added resume/build/clintp.gif
68 changes: 68 additions & 0 deletions resume/examples/transform.py
@@ -0,0 +1,68 @@
import json
from urllib.parse import urlparse
from urllib.request import urlopen

import luigi
from PIL import Image
from segno import helpers


class ExtractResume(luigi.Task):
    """Points at the resume JSON; run() is a no-op, so the file must already exist."""
    output_file = luigi.Parameter(default='resume.clintp.json')

    def run(self):
        pass

    def output(self):
        return luigi.LocalTarget(self.output_file)


class FetchAndConvertProfileImage(luigi.Task):
    """Downloads the profile image named in the resume and pixelates it."""

    def requires(self):
        return {
            'resume': ExtractResume()
        }

    def run(self):
        with open(self.input()['resume'].path) as resume_file:
            resume = json.load(resume_file)

        with urlopen(resume['basics']['image']) as profile_image:
            # Reduce the palette, then shrink to 40x40 and scale back up with
            # nearest-neighbour resampling to get a pixelated image.
            image = Image.open(profile_image).quantize(
                colors=256, method=2).convert('RGB')
            pixelated = image.resize((40, 40)).resize(image.size, Image.NEAREST)

        pixelated.save(self.output().path)

    def output(self):
        return luigi.LocalTarget("build/clintp.gif")


class GenerateQrCode(luigi.Task):
    """Builds a MeCard QR code with the pixelated profile image as background."""

    def requires(self):
        return {
            'resume': ExtractResume(),
            'profile_image': FetchAndConvertProfileImage()
        }

    def run(self):
        # Load the resume provided by the upstream task.
        with open(self.input()['resume'].path) as resume_file:
            resume = json.load(resume_file)

        # Drop the scheme from the site URL so it is not embedded in the QR code.
        parsed_url = urlparse(resume['basics']['url'])
        url_without_protocol = parsed_url.netloc + parsed_url.path

        qrcode = helpers.make_mecard(name=resume['basics']['name'],
                                     email=resume['basics']['email'],
                                     url=url_without_protocol)

        # to_artistic is provided by the qrcode-artistic plugin for segno.
        qrcode.to_artistic(background=self.input()['profile_image'].path,
                           target=self.output().path, scale=5)

    def output(self):
        return luigi.LocalTarget("build/clintp-qrcode.gif")


if __name__ == "__main__":
    luigi.build([GenerateQrCode()], workers=1, local_scheduler=True)
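
A minimal sketch of the scheme-stripping step used in GenerateQrCode.run, isolated so its behaviour on a few URL shapes is easy to check. It uses only the standard library. The helper name strip_scheme and the example.com URLs are hypothetical and do not appear in this PR.

```python
from urllib.parse import urlparse


def strip_scheme(url: str) -> str:
    """Return host plus path, dropping the scheme, query string, and fragment."""
    parsed = urlparse(url)
    # Same expression as in GenerateQrCode.run: netloc + path.
    return parsed.netloc + parsed.path


if __name__ == "__main__":
    # Placeholder URL, not the one stored in the resume.
    print(strip_scheme("https://example.com/resume"))
```

Note that netloc + path also discards any query string or fragment, and a URL that has no scheme to begin with is returned unchanged because urlparse places the whole string in path.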
4 changes: 0 additions & 4 deletions resume/resume.clintp.json
@@ -315,10 +315,6 @@
}
],
"references": [
{
"name": "Danny Chin",
"reference": "It is my pleasure to recommend Richard, his performance working as a consultant for Main St. Company proved that he will be a valuable addition to any company."
}
],
"projects": [
{