This repository has been archived by the owner on Mar 9, 2021. It is now read-only.

New Feature Requests and Possible Enhancements

Johannes Meyer zum Alten Borgloh edited this page Aug 26, 2017 · 11 revisions

Ideas to include/enhance the application

  • Parse options from the command line and add a non-UI mode.

  • Use the post's creation time as the file creation time.

  • Maybe add support for 500px, Instagram, Flickr, Pinterest, or other Tumblr-like sites?

  • ~~Download only original content from a Tumblr blog, no reblogs.~~

  • Redesign the UI layout and the controls.

  • Re-download of a blog.

  • File rename function. Multiple possible implementation variants:

    • Freely defined pattern based on user input.
    • Named after the post.
    • Add a leading number indicating the post's position in the blog.
  • Implement a filesystem watcher that reads .txt files containing new instructions from a user-definable folder, e.g. "download blog X with tags Y now".

  • Theme support.

  • Move the blog index (i.e. .tumblr) files to a separate, user-definable folder?

  • Linux (G)UI
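
The "post's creation time as file creation time" idea from the list above could be sketched roughly as follows. This is only an illustration: `PostTimestamps` and `Apply` are hypothetical names, and it assumes the post metadata exposes its publication time as a Unix timestamp in seconds.

```csharp
using System;
using System.IO;

public static class PostTimestamps
{
    // Hypothetical helper (not part of the existing code base): stamps an
    // already downloaded file with the publication time of its post.
    // Assumption: postUnixTimestamp is seconds since 1970-01-01 UTC.
    public static void Apply(string filePath, long postUnixTimestamp)
    {
        DateTime postTimeUtc =
            DateTimeOffset.FromUnixTimeSeconds(postUnixTimestamp).UtcDateTime;

        // Set both values so file managers that sort by "modified"
        // show the same order as those that sort by "created".
        File.SetCreationTimeUtc(filePath, postTimeUtc);
        File.SetLastWriteTimeUtc(filePath, postTimeUtc);
    }
}
```

How the creation time is honored depends on the file system and platform, so the last write time is probably the more portable of the two values.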

Code Enhancements

  • Create a top-to-bottom async/await implementation. In some places there are still I/O-bound methods wrapped in Task.Run(), which occupies an additional thread pool thread and is most likely unnecessary. I've started a new async branch where I've added an AsyncDelegateCommand, so everything except the URL grabber and the downloader thread should be convertible to async/await, I guess.
  • Use more events for database/UI updates instead of methods.
  • We could combine all the WebRequest code, including a resumable FileDownload method etc., in one class and add events that update the UI, report the current download speed in kb/s, or update other statistics.
  • Redesign and clean up the database structures.
  • Repository pattern.
  • Unit tests, probably too late :/ and probably my task.
  • Documentation (probably my task).
  • Make writes async. Right now every open file uses a thread from the thread pool.
  • In the Downloader.cs class, clear the statisticsBag after stats have been processed to free memory for downloads of large blogs. The grabber should be way faster than the downloader.
  • In the AbstractDownloader.cs class, remove a Task from the trackedTasks list after a file download was successfully completed to free memory for downloads of large blogs.
  • Use composition instead of inheritance for the Crawler classes as it's already messy?
  • Use HttpClient instead of HttpWebRequests for downloading.
  • More general code refactoring ..
  • ..
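
As a rough illustration of the async/await, async-write, and HttpClient items above, the sketch below downloads a file without wrapping blocking I/O in Task.Run(). It is not the project's code: `AsyncDownloadSketch`, `DownloadAsync`, and `SaveAsync` are hypothetical names, and resuming, retries, and progress events are left out.

```csharp
using System.IO;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncDownloadSketch
{
    // A single shared HttpClient; creating one per request wastes sockets.
    private static readonly HttpClient Client = new HttpClient();

    public static async Task DownloadAsync(
        string url, string destinationPath, CancellationToken ct)
    {
        // ResponseHeadersRead streams the body instead of buffering it in memory.
        using (HttpResponseMessage response = await Client
            .GetAsync(url, HttpCompletionOption.ResponseHeadersRead, ct)
            .ConfigureAwait(false))
        {
            response.EnsureSuccessStatusCode();
            using (Stream source = await response.Content
                .ReadAsStreamAsync().ConfigureAwait(false))
            {
                await SaveAsync(source, destinationPath, ct).ConfigureAwait(false);
            }
        }
    }

    public static async Task SaveAsync(
        Stream source, string destinationPath, CancellationToken ct)
    {
        // useAsync: true requests asynchronous file I/O, so no thread pool
        // thread has to block while the write is pending.
        using (var target = new FileStream(
            destinationPath, FileMode.Create, FileAccess.Write,
            FileShare.None, bufferSize: 81920, useAsync: true))
        {
            await source.CopyToAsync(target, 81920, ct).ConfigureAwait(false);
        }
    }
}
```

Because every step is awaited rather than blocked on, the number of simultaneous downloads is no longer tied to the number of thread pool threads.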

How to add a new website crawler

It should be quite straightforward now to add new sites like 500px, Instagram, or Twitter. The downloader and most of the UI should be able to handle different websites.

You can check this commit for my addition of the Tumblr tag search downloader for the most recent example. Older examples include the Tumblr liked-by downloader, the Tumblr downloader for private blogs and the Tumblr search downloader.

In essence, you have to:

  • Implement the ICrawler interface and override the Crawl method to start the crawler and a DownloadBlogAsync task from the IDownloader interface. See here for an example implementation in the TumblrTagSearchCrawler.
  • Adjust the URL validator so that it recognizes URLs of the new site.
  • Add your BlogType.
  • Add your Crawler to the CrawlerFactory.
  • Add your BlogType to the BlogFactory.
  • Update the CanAdd() method in the ManagerController.cs.
  • You might want to add a new DetailsView.cs if you want different checkboxes or statistics.
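
The first steps above might look roughly like the sketch below. All types here are simplified stand-ins written for this example: the real ICrawler, IDownloader, BlogType, and CrawlerFactory in the repository are larger, and their signatures may differ. `ExampleCrawler` and `BlogType.Example` are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

// Simplified stand-ins for the project's types (assumed shapes, not the real ones).
public enum BlogType { Tumblr, Example }

public interface IDownloader
{
    Task<bool> DownloadBlogAsync();
}

public interface ICrawler
{
    Task Crawl();
}

// Hypothetical crawler for a new site.
public sealed class ExampleCrawler : ICrawler
{
    private readonly IDownloader downloader;

    public ExampleCrawler(IDownloader downloader)
    {
        if (downloader == null) throw new ArgumentNullException(nameof(downloader));
        this.downloader = downloader;
    }

    public async Task Crawl()
    {
        // Start the downloader first so it can work while URLs are still grabbed.
        Task<bool> download = downloader.DownloadBlogAsync();
        await GrabUrlsAsync().ConfigureAwait(false);
        await download.ConfigureAwait(false);
    }

    private Task GrabUrlsAsync()
    {
        // Site-specific page fetching and URL extraction would go here.
        return Task.CompletedTask;
    }
}

public static class CrawlerFactory
{
    // Register the new crawler next to the existing ones.
    public static ICrawler GetCrawler(BlogType type, IDownloader downloader)
    {
        switch (type)
        {
            case BlogType.Example:
                return new ExampleCrawler(downloader);
            default:
                throw new ArgumentException("No crawler registered for " + type, nameof(type));
        }
    }
}
```

The remaining steps (URL validator, BlogFactory, CanAdd(), and an optional DetailsView) depend on existing project code and are not shown.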