stepince/digitalreasoning

Digital Reasoning Remote Programming Exercise

Below are instructions for building, testing and executing the exercise. I have checked in all sources, classes, docs and outputs. If you have any questions, please do not hesitate to call or email. tel: 201 955-8113 email: [email protected]

The ant command can be used to build, test and execute the exercise.

Unix (bash) scripts and Windows batch scripts are also provided. They run without arguments, and each accepts optional parameters; see the individual script for usage details.

USING ANT

Building

$ ant

Unit Testing

$ ant test

Executing

The output is written to part1_output.xml, part2_output.txt, part2_output.xml, and part3_output.txt.

$ ant run

Javadoc

$ ant javadoc

Cleaning

$ ant clean

USING SCRIPTS

Unit Testing

$ test_part[123]

where [123] represents the question number, e.g. to run the tests for question 1:

$ test_part1

Executing

$ run_part[123]

where [123] represents the question number, e.g. to execute question 1:

$ run_part1

DESIGN and IMPLEMENTATION

All the Java sources are in the digitalreasoning package under /src.

Question 1.

assumptions: US locale.
limitations: size of source files, size of dictionary.
alternative implementation: use a trie for data storage; it can be more memory efficient when many entries share prefixes.

/src/digitalreasoning/DocumentTokenizer.java is the Java source implementing question 1.
/src/digitalreasoning/TestDocumentTokenizer.java is the unit test driver source for testing question 1.
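
The trie alternative mentioned above could look roughly like the following. This is a minimal sketch only, not the code in DocumentTokenizer.java; the class name DictionaryTrie and its methods are hypothetical and chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a character trie; not part of the checked-in sources. */
class DictionaryTrie {

    /** One node per character; words that share a prefix share nodes. */
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean isWord;
    }

    private final Node root = new Node();

    /** Adds a word, creating nodes only for characters not already stored. */
    void insert(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.computeIfAbsent(word.charAt(i), c -> new Node());
        }
        node.isWord = true;
    }

    /** Returns true only if exactly this word was inserted. */
    boolean contains(String word) {
        Node node = find(word);
        return node != null && node.isWord;
    }

    /** Returns true if any inserted word starts with the given prefix. */
    boolean hasPrefix(String prefix) {
        return find(prefix) != null;
    }

    /** Walks the trie along s; returns null if the path does not exist. */
    private Node find(String s) {
        Node node = root;
        for (int i = 0; i < s.length(); i++) {
            node = node.children.get(s.charAt(i));
            if (node == null) {
                return null;
            }
        }
        return node;
    }
}
```

With this layout, "reason" and "reasoning" share the six nodes of their common prefix, which is where any memory saving would come from. Whether it beats a hash-based dictionary in practice depends on the data and on per-node overhead.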

Question 2.

assumptions: US locale.
limitations: proper names are found by linear search within each sentence; size of source files; size of dictionary.
alternative implementation: a suffix tree could be used as the sentence data structure. Looking up a proper name would then take time proportional to the length of the name, independent of the length of the sentence. Ukkonen's algorithm can build a suffix tree in O(n) time.

/src/digitalreasoning/ProperNameDocumentTokenizer.java is the Java source implementing question 2.
/src/digitalreasoning/TestProperNameDocumentTokenizer.java is the unit test driver source for testing question 2.
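
To illustrate the suffix-structure idea mentioned above, here is a minimal sketch of a naive suffix trie over the word tokens of one sentence. It is deliberately not Ukkonen's algorithm: construction here is quadratic in the number of tokens, and only the lookup side is shown faithfully. The class name SentenceSuffixTrie and the whitespace tokenization are assumptions for illustration, not taken from the checked-in sources.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: naive suffix trie over the tokens of a single sentence. */
class SentenceSuffixTrie {

    private static final class Node {
        final Map<String, Node> children = new HashMap<>();
    }

    private final Node root = new Node();

    /** Inserts every suffix of the token sequence (O(n^2) nodes in the worst case). */
    SentenceSuffixTrie(String[] tokens) {
        for (int start = 0; start < tokens.length; start++) {
            Node node = root;
            for (int i = start; i < tokens.length; i++) {
                node = node.children.computeIfAbsent(tokens[i], t -> new Node());
            }
        }
    }

    /**
     * Returns true if the phrase occurs as a contiguous token run in the sentence.
     * The walk takes one step per phrase token, regardless of sentence length.
     */
    boolean containsPhrase(String[] phrase) {
        Node node = root;
        for (String token : phrase) {
            node = node.children.get(token);
            if (node == null) {
                return false;
            }
        }
        return true;
    }
}
```

For example, after building the structure from the tokens of "Barack Obama visited New York", a lookup for the tokens "New", "York" succeeds, while "Obama", "New" fails because those tokens are not adjacent. A real suffix tree built with Ukkonen's algorithm would give the same lookup behaviour with linear construction cost.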

Question 3.

assumptions: US locale.
limitations: proper names are found by linear search within each sentence; size of source files; size of dictionary.
alternative implementation: a suffix tree could be used as the sentence data structure. Looking up a proper name would then take time proportional to the length of the name, independent of the length of the sentence. Ukkonen's algorithm can build a suffix tree in O(n) time.

/src/digitalreasoning/ProperNameAggregator.java is the Java source implementing question 3.
/src/digitalreasoning/TestProperNameAggregator.java is the unit test driver source for testing question 3.
