Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/jplag implementation #447

Open
wants to merge 15 commits into
base: development
Choose a base branch
from

Conversation

JackSCarroll
Copy link

@JackSCarroll JackSCarroll commented Sep 20, 2024

Depends on doubtfire-lms/doubtfire-deploy#28
Depends on doubtfire-lms/doubtfire-web#873
Feasibility document: thoth-tech/documentation#528

Description

This branch adds JPLAG software plagiarism detection to OnTrack. The unit_similarity_module.rb file has been modified to extend the check_similarity function to perform a docker exec command to generate a jplag report for each task with "done" files in a unit.
It extracts the code files from the "done" zip files and temporarily stores them in tmp/jplag/unitcode/task while the jplag report is being generated.

Type of change

  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

  • Use the modified docker-compose and added jplag.Dockerfile from Feature/add jplag plagiarism detection #28
  • Create a new task in a unit and have at least 2 student users upload similar code files for the task
  • Change task due date to be in the past, so submissions can be accepted in the check_similarity check
  • Run rake submission:generate_pdfs while cd'ed in doubtfire-api. This will create the "done" files.
  • Run rake submission:check_plagiarism while cd'ed in doubtfire-api. This will run jplag on all units and tasks.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation if appropriate
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have created or extended unit tests to address my new additions
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

@macite
Copy link
Member

macite commented Sep 20, 2024

Can you recreate commit 90ef645 with a chore tag at the start of the commit message.

Copy link
Member

@macite macite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a good start, but there seems to be quite a bit to do still.

# making temp directory for each task - jplag
tasks_dir = root_work_dir.join(td.id.to_s)
FileUtils.mkdir_p(tasks_dir)

next if td.moss_language.nil? || td.upload_requirements.nil? || td.upload_requirements.select { |upreq| upreq['type'] == 'code' && upreq['tii_check'] }.empty?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this still be moss_language?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm using the upload requirement language from the task definition ui task-definition-upload.component.html.
Feel like it makes sense to use existing language definition considering the task creator has already decided if plagiarism checks are necessary from MOSS.
So if no requirement for MOSS, no requirement for JPLAG.
Let me know if you don't think is this appropriate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this we should remove the old moss code, so these could be renamed...

t.extract_file_from_done(tasks_dir, pattern, ->(_task, to_path, name) { File.join(to_path.to_s, t.student.username.to_s, name.to_s) })
end
puts "Starting JPLAG container to run on #{tasks_dir}"
tasks_dir_split = tasks_dir.to_s.split('/workspace/doubtfire-api')[1]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hardcoded dev container path... won't work in production

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have changed to Rails.root

app/models/similarity/unit_similarity_module.rb Outdated Show resolved Hide resolved

# Check if the directory exists and create it if it doesn't
results_dir = "/jplag/results/#{unit_code}"
`sudo docker exec jplag sh -c 'if [ ! -d "#{results_dir}" ]; then mkdir -p "#{results_dir}"; fi'`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sudo?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be outside the upload requirements loop? What happens if you have multiple files being checked in the same task definition?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sudo is needed to make the docker exec commands work. Not entirely sure why 🙃.
Have moved the dir checks outside the loop and still works as expected.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docker exec doesn't require sudo in production since it'll run as the root user, but the permission errors in the devcontainer are due to it using the vscode user

You can fix this by setting permissions on /var/run/docker.sock in https://github.com/doubtfire-lms/doubtfire-deploy

result_file = "#{results_dir}/#{task_definition.id}-result.zip"
`sudo docker exec jplag sh -c 'if [ -f "#{result_file}" ]; then rm "#{result_file}"; fi'`

case type_data[1]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this all the languages available?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because some of the language names in MOSS and JPLAG differ, i've added a check for c++ and python.
Currently only supporting languages defined from the task definition ui task-definition-upload.component.html
Have added comments to both task-definition-upload.component.html and unit-similarity-module.rb as reminder to check MOSS and JPLAG language defintions

app/models/similarity/unit_similarity_module.rb Outdated Show resolved Hide resolved
app/models/similarity/unit_similarity_module.rb Outdated Show resolved Hide resolved
app/models/similarity/unit_similarity_module.rb Outdated Show resolved Hide resolved
puts "Files to delete: #{Dir.glob("#{tmp_dir}/*")}"
FileUtils.rm_rf(Dir.glob("#{tmp_dir}/*"))

self
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where are the similarity matches created?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to parse the report and link related tasks

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a function to create the similarity matches. Here's the commit.
It opens the overview.json file from the report and extracts the max similarity between 2 submissions.
It also finds the task id by looking in /files/student_name/task_id and grabbing the task id folder name; it matches the student names from the two that are being looked at currently in overview.json.

Copy link
Member

@macite macite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we need to remove the Moss code, switch the languages in the task definitions, and add in some unit tests.

# making temp directory for each task - jplag
tasks_dir = root_work_dir.join(td.id.to_s)
FileUtils.mkdir_p(tasks_dir)

next if td.moss_language.nil? || td.upload_requirements.nil? || td.upload_requirements.select { |upreq| upreq['type'] == 'code' && upreq['tii_check'] }.empty?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this we should remove the old moss code, so these could be renamed...

root_dir = Rails.root.to_s
tasks_dir_split = tasks_dir.to_s.split(root_dir)[1]

# Set the file language based on the type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can be removing MOSS with this - so the languages can be updated to those we want from JPlag. This would be part of the front end change as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants