Various scripts, mostly intended to help with model training and dataset creation
Updated Oct 17, 2024 - Python
DALLE-tools provides useful dataset utilities to improve your workflow with WebDatasets.
Artifician is an event-driven framework designed to simplify and accelerate the process of preparing datasets for Artificial Intelligence models.
ManaTTS is the largest open Persian speech dataset with 86+ hours of transcribed audio. Includes data collection pipeline and tools. Suitable for Persian text-to-speech models.
🚀 When you need to look through a huge pile of images and cannot rely on a file explorer, or you are working on a remote headless machine, you can use this tool. It also lets you move files from one folder to another, creating the destination if it does not exist. Work in progress.
A freely licensed Persian TTS dataset including 6+ hours of audio-text pairs with subject
This repository presents a project focused on image recognition of nuts and screws using object detection techniques. The objective is to develop a model capable of accurately detecting and classifying nuts and screws in images, enabling automation and quality control in industrial settings.
Quantization Aware Training
A pipeline for machine translation (using OPUS-MT models) of parliamentary text collections in 30+ languages (ParlaMint corpora). The pipeline includes parsing TEI XML and CoNLL-U files, linguistic processing with the Stanza pipeline, machine translation, and word alignment with the Eflomal tool.
Repo for a bachelor thesis on CS:GO encounter predictions
This repository contains code for training a convolutional neural network (CNN) model to classify images.
In this work, we detect diabetic retinopathy from retina images. To accomplish that, we first explore the retina images thoroughly.
This repository utilizes TensorFlow Object Detection API for tomato leaf disease identification, including setup scripts, dataset preparation, model training, TensorFlow Lite conversion, and inference tools. It serves as a guide for efficient disease detection in agriculture.
imagepreprocessing
A web-based viewer for selecting dimensions from high-dimensional datasets and viewing them as a 3D cube and pairwise orthographic projections. Created for data exploration and preparation of stimuli for a spatial perception study.