https://github.com/valperaled/DSaS/blob/4e735a8763c7a8eb11ce64c3fcc25e8a618acde1/Assignment%206/crime-analytics-EAPV.ipynb #70

Open
wants to merge 3 commits into
base: capstone
39 changes: 2 additions & 37 deletions assignment4/README.txt
@@ -1,38 +1,3 @@
-Instructions on how to run example.pig.
+This repository describes an assignment for working with ~600GB of graph data using modern analytics languages.
 
-================================================================
-
-STEP 1:
-
-Importing the myudfs.jar file in pig. You need this because
-example.pig uses the function RDFSplit3(...) which is defined in myudfs.jar:
-
-OPTION 1: Do nothing. example.pig is already configured to read
-myudfs.jar from S3, through the line:
-
-register s3n://uw-cse-344-oregon.aws.amazon.com/myudfs.jar
-
-
-OPTION 2: do-it-yourself; run this on your local machine:
-
-cd pigtest
-ant -- this should create the file myudfs.jar
-
-Next, modify example.pig to:
-
-register ./myudfs.jar
-
-Next, after you start the AWS cluster, copy myudfs.jar to the AWS
-Master Node (see hw6-awsusage.html).
-
-================================================================
-
-STEP2
-
-Start an AWS Cluster (see hw6-awsusage.html), start pig interactively,
-and cut and paste the content of example.pig. I prefer to do this line by line
-
-
-Note: The program may appear to hang with a 0% completion time... go check the job tracker. Scroll down. You should see a MapReduce job running with some non-zero progress.
-
-Also note that the script will generate more than one MapReduce job.
+Start with assignment4.md
4 changes: 3 additions & 1 deletion capstone/blight/blightfight.md
@@ -35,7 +35,9 @@ Some articles you may find useful:
 * [The relationship between abandonment and crime](readings/RALEIGH_et_al-2015-Journal_of_Urban_Affairs.pdf)
 * [Detroit demolishes its ruins: 'The capitalists will take care of the rest'](http://www.theguardian.com/money/2014/sep/28/detroit-demolish-ruins-capitalists-abandoned-buildings-plan)
 
-### Discussion Prompt: Share your background, interest, and goals for this Capstone Project, and any questions or considerations from your domain research. How important is this problem? How accurate do you think the models will be? What kinds of concerns might there be around equity? For example, in some cities, 311 calls may be rare in poor neighborhoods, so a model that predicts abandonment that uses 311 calls may favor certain neighborhoods over others.
+### Discussion Prompt:
+
+Share your background, interest, and goals for this Capstone Project, and any questions or considerations from your domain research. How important is this problem? How accurate do you think the models will be? What kinds of concerns might there be around equity? For example, in some cities, 311 calls may be rare in poor neighborhoods, so a model that predicts abandonment that uses 311 calls may favor certain neighborhoods over others.
 
 
 ## Week 2. Create a list of "buildings" from a list of geo-located incidents