This repository has been archived by the owner on Oct 29, 2024. It is now read-only.

chore(text): add client sample code
chuang8511 committed Sep 24, 2024
1 parent ddf32f6 commit bd7391f
Showing 5 changed files with 256 additions and 0 deletions.
60 changes: 60 additions & 0 deletions example/text/main.go
@@ -0,0 +1,60 @@
package example

import (
	"flag"
	"fmt"
	"os"
	"strings"

	"github.com/instill-ai/component/operator/text/v0"
)

func main() {

	// Parse command-line arguments
	filePaths := flag.String("file_paths", "", "Comma-separated list of file paths")
	chunkSize := flag.Int("chunksize", 800, "Size of each chunk")
	chunkOverlap := flag.Int("chunkoverlap", 300, "Size of overlap between chunks")

	flag.Parse()

	// Check if file paths are provided
	if *filePaths == "" {
		fmt.Fprintln(os.Stderr, "no file paths provided; use -file_paths")
		return
	}

	files := strings.Split(*filePaths, ",")

	for _, file := range files {
		// Read the whole file; report the failure instead of exiting silently
		b, err := os.ReadFile(file)
		if err != nil {
			fmt.Fprintf(os.Stderr, "read %s: %v\n", file, err)
			return
		}

		rawText := string(b)

		// Build the chunking request using the Markdown strategy
		input := text.ChunkTextInput{
			Text: rawText,
			Strategy: text.Strategy{
				Setting: text.Setting{
					ChunkMethod:  "Markdown",
					ChunkSize:    *chunkSize,
					ChunkOverlap: *chunkOverlap,
					ModelName:    "gpt-4",
				},
			},
		}

		output, err := text.ChunkMarkdown(input)
		if err != nil {
			fmt.Fprintf(os.Stderr, "chunk %s: %v\n", file, err)
			return
		}

		// Print each chunk with its index
		for i, chunk := range output.TextChunks {
			fmt.Printf("\n\nChunk %d:\n %s\n\n\n", i, chunk.Text)
		}

	}
}
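The `chunksize` and `chunkoverlap` flags above control how consecutive chunks share text. As a rough illustration only, the sketch below slides a fixed-size window over the input, advancing by `size - overlap` each step. The helper `chunkWithOverlap` is hypothetical and counts runes; the `text` component's Markdown chunker may well count tokens (the sample passes `ModelName: "gpt-4"`) and respect Markdown structure, so its output will differ.

```go
package main

import "fmt"

// chunkWithOverlap is a hypothetical helper, not part of the text component.
// It splits s into windows of at most size runes; each window starts
// (size - overlap) runes after the previous one, so neighbours share
// overlap runes.
func chunkWithOverlap(s string, size, overlap int) ([]string, error) {
	if size <= 0 || overlap < 0 || overlap >= size {
		return nil, fmt.Errorf("need size > 0 and 0 <= overlap < size, got size=%d overlap=%d", size, overlap)
	}
	runes := []rune(s)
	step := size - overlap
	var chunks []string
	for start := 0; start < len(runes); start += step {
		end := start + size
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break // the last window already reaches the end of the input
		}
	}
	return chunks, nil
}

func main() {
	chunks, err := chunkWithOverlap("abcdefghij", 4, 2)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, c := range chunks {
		fmt.Printf("chunk %d: %q\n", i, c)
	}
}
```

With a size of 4 and an overlap of 2, each chunk repeats the last two runes of the previous one. The guard on `overlap >= size` matters because the window would otherwise never advance.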
31 changes: 31 additions & 0 deletions example/text/main.py
@@ -0,0 +1,31 @@
import subprocess


def execute_go_program(file_paths, chunksize, chunk_overlap):
    file_paths_str = ",".join(file_paths)

    # Build the Go client; check=True raises if the build fails
    subprocess.run(["go", "build", "-o", "chunk_text", "main.go"], check=True)

    # Run the binary and capture its output as text
    result = subprocess.run(
        [
            "./chunk_text",
            f"--file_paths={file_paths_str}",
            f"--chunksize={chunksize}",
            f"--chunkoverlap={chunk_overlap}",
        ],
        capture_output=True,
        text=True,
    )

    print(result.stdout)
    if result.stderr:
        print("Error:", result.stderr)

    # Clean up the built binary
    subprocess.run(["rm", "chunk_text"])


file_paths = ["test_data_with_lists.md", "test_data_with_table_and_lists.md"]
chunksize = 800
chunk_overlap = 300

execute_go_program(file_paths, chunksize, chunk_overlap)
27 changes: 27 additions & 0 deletions example/text/main_test.go
@@ -0,0 +1,27 @@
package example

import (
	"os"
	"strings"
	"testing"
)

// TestMainFile only confirms the client code runs without errors
func TestMainFile(t *testing.T) {

	files := []string{
		"test_data_with_lists.md",
		"test_data_with_table_and_lists.md",
	}

	os.Args = []string{
		"example",
		"-file_paths", strings.Join(files, ","),
		"-chunksize", "800",
		"-chunkoverlap", "200",
	}

	// Run the main function
	main()

}
75 changes: 75 additions & 0 deletions example/text/test_data_with_lists.md
@@ -0,0 +1,75 @@
## Introduction to urinary tract infection
Urinary tract infections are common healthcare-associated infections, and most healthcare-associated urinary tract infections are caused by the use of equipment in the urinary tract system. Healthcare-associated urinary tract infections are divided into catheter-associated urinary tract infections and non-catheter-associated urinary tract infections. According to statistics, about 12-16% of hospitalized adult patients will receive an indwelling urinary catheter during hospitalization. Research has found that for every day a urinary catheter is left in place, the patient's risk of developing a catheter-associated urinary tract infection (CAUTI) increases by 3-7%.

Complications caused by CAUTI are quite diverse, including: male prostatitis, adhesions, orchitis, cystitis, pyelonephritis, bacteremia, endocarditis, spinal osteomyelitis, septic arthritis, endophthalmitis, meningitis, etc. The occurrence of CAUTI and its subsequent complications will lead to patient suffering, prolonged hospitalization, increased medical costs and mortality. According to data from Taiwan’s Nosocomial Infection Surveillance Notification System, in 2016, urinary tract infections in hospitals above the regional level in Taiwan accounted for 36% of all medical care-related infections, ranking second among all infection sites, and 91% of them were related to the use of urinary catheters.

## Explanation of urinary tract infection terms
1. The following criteria apply for the determination of medical care-related infections:
   - Infection present on admission (POA)
   - Healthcare-associated infections (HAI)
   - Date of infection (DOE): For urinary tract infections, the date of infection (DOE) refers to the date of the first occurrence of an infection that meets the criteria for the definition of urinary tract infection surveillance within the 7-day infection window period (IWP).
   - Infection window period (IWP)
   - Repeat infection timeframe (RIT)
   - Attributable period for secondary bloodstream infection
2. Indwelling catheter: A drainage tube inserted into the bladder through the urethra and left in place, and the end of the tube should be connected to a drainage bag (including leg bag); such a device is also called a Foley catheter. Therefore, condom catheters, straight in-and-out catheters, nephrostomy tubes, ileal conduits or suprapubic catheters are not included unless an indwelling urinary catheter is also in place. Indwelling urethral catheters used for intermittent or continuous irrigation are also included in catheter-associated urinary tract infection surveillance.
3. Catheter-Associated Urinary Tract Infection (CAUTI):
   1. The day the indwelling catheter is placed is the first day of catheter use. On the date of urinary tract infection (DOE), the patient has used the indwelling catheter for more than 2 calendar days, and the indwelling catheter was still in place on the date of urinary tract infection (DOE) or the day before.
   2. Indwelling urinary catheters that are removed and re-inserted: The purpose of infection surveillance is not to monitor whether a specific urinary catheter has become infected, but to monitor the risk that indwelling catheter use causes urinary tract infection in the patient.
      - If the patient remains without an indwelling catheter for at least 1 full calendar day (not calculated as 24 hours) after removal of the indwelling catheter, the count of catheter days restarts from 1 when a catheter is reinserted;
      - If a new indwelling urinary catheter is reinserted before 1 full calendar day has elapsed after removal, the number of days of catheter use will continue to count.
      - If the patient's date of urinary tract infection (DOE) is on the second day after the catheter is inserted, the case cannot be classified as a catheter-associated urinary tract infection (CAUTI), because the catheter has not been in place for more than 2 calendar days on the date of infection (DOE); however, whether the patient qualifies as a medical care-associated urinary tract infection (UTI) case can still be determined based on the date of admission.

## Urinary tract infection surveillance definition
1. Catheter-associated urinary tract infection (CAUTI) determination criteria: The patient must meet the following 3 conditions, and all conditions must occur within the infection window period (IWP):
   1. The day the indwelling urinary catheter is placed is the first day of catheter use. The patient has an indwelling urinary catheter for more than 2 calendar days on the date of infection (DOE), and the catheter is still in place on the date of infection (DOE) or was removed the day before.
   2. The patient has at least one of the following signs or symptoms:
      - Fever (>38.0°C)
      - suprapubic tenderness
      - Costovertebral angle pain or tenderness
      - Urinary urgency; but not applicable during indwelling urinary catheter
      - Frequent urination (urinary frequency); but not applicable during indwelling urinary catheter
      - Difficulty or pain in urination (dysuria); but not applicable during indwelling urinary catheter
   3. No more than 2 types of microorganisms are cultured in urine, and at least one of them has a colony count ≥ 100000 cfu/ml.

2. Non-catheter related urinary tract infection (Non-CAUTI) criteria: The patient must meet the following 3 conditions, and all conditions must occur within the infection window period (IWP):
   1. The day when the indwelling urinary catheter is placed is the first day of catheter use. The patient has an indwelling urinary catheter for no more than 2 calendar days on the date of infection (DOE), or no indwelling urinary catheter was in place on the date of infection (DOE) or the day before.
   2. Have at least one of the following signs or symptoms:
      - Fever (>38.0°C); this only applies to patients ≤ 65 years old
      - suprapubic tenderness
      - Costovertebral angle pain or tenderness
      - Urinary urgency
      - Urinary frequency
      - Difficulty or pain in urinating (dysuria)
   3. No more than 2 types of microorganisms are cultured in urine, and at least one of them has a colony count ≥ 100000 cfu/ml.

3. Catheter-associated and non-catheter-associated urinary tract infection (CAUTI or Non-CAUTI) in infants 1 year of age or younger: The patient must meet the following 3 conditions, and all conditions must occur within the infection window period (IWP):
   1. Patient age ≦ 1 year old (regardless of whether a urinary catheter is indwelling)
   2. Have at least one of the following symptoms or signs:
      - Fever (rectal temperature >38.0°C)
      - Hypothermia (rectal temperature <36.0°C)
      - apnea
      - Bradycardia
      - Lethargy
      - vomiting
      - suprapubic tenderness
      - Costovertebral angle pain/tenderness
   3. No more than 2 types of microorganisms are cultured in urine, and at least one of them has a colony count ≥ 100000 cfu/ml.
   4. Based on the patient's catheter usage, determine whether the case is consistent with catheter-associated urinary tract infection (CAUTI) or a general urinary tract infection case.

4. Determination criteria for asymptomatic bacteremic UTI (ABUTI): The patient must meet the following 3 conditions, and all conditions must occur within the infection window period (IWP):
   1. Regardless of whether the patient has an indwelling urinary catheter, the patient does not have any symptoms or signs that meet the criteria for symptomatic urinary tract infection.
      - Based on the patient's catheter usage, determine whether the case is a catheter-associated urinary tract infection (CAUTI) or a general urinary tract infection.
   2. No more than 2 types of microorganisms are cultured in urine, and at least one of them has a colony count ≥ 100000 cfu/ml
   3. Microorganisms are detected in blood samples through culture or other non-culture microbiological testing methods, and at least one of them is consistent with the microorganisms cultured in urine with a colony count ≥ 100,000 cfu/ml.
      - If a patient over 65 years old does not use a urinary catheter but has symptoms of fever (>38.0°C), he or she may still meet the criteria for asymptomatic bacteremic UTI (ABUTI).

## Urinary tract infection notification precautions
1. Suprapubic tenderness:
   - Information can come from palpation (tenderness-sign) or patient-reported symptoms (pain-symptom). As long as the relevant information is recorded in the medical record and the date of the symptom record is within the infection window period (IWP), it can be used to meet the criteria for symptomatic urinary tract infection (SUTI).
   - Lower abdominal pain, bladder or pelvic discomfort, etc., can be regarded as symptoms of suprapubic tenderness; however, general abdominal pain recorded in medical records cannot be used as a basis for suprapubic tenderness, because there are many causes of abdominal pain, and such symptoms are very common.
2. Pain in the lower left or lower right side of the back or flank can be regarded as a symptom of costovertebral angle pain/tenderness; however, general low back pain recorded in the medical record cannot be used as a basis for costovertebral angle pain/tenderness.
3. "Mixed flora" cannot be reported as the causative agent of medical care-related infections. As long as "Mixed flora" appears in the culture report of a urine specimen, the test result of this specimen cannot be used to meet the criteria for determining urinary tract infection.

## Determination of urinary tract infection ward
1. The infection ward is the ward where the patient stayed on the date of infection (DOE).
2. Transfer Rule: If the date of infection (DOE) is on the day of transfer out of the ward or discharge, or the next day, the infection ward will be the transfer-out ward/discharge location; but if the patient was transferred multiple times on the date of infection (DOE) or the day before, the infection ward will be determined as the first ward the patient stayed in on the day before the date of infection (DOE).
63 changes: 63 additions & 0 deletions example/text/test_data_with_table_and_lists.md
@@ -0,0 +1,63 @@
# Comprehensive Project Overview

This document provides an in-depth overview of the project, outlining key tasks, detailed steps for implementation, and an extended table showing team assignments, deadlines, and status.

## Detailed Tasks

### Unordered List
- **Set up project environment:**
  - installing the necessary dependencies
  - setting up virtual environments
  - configuring local development environments for each team member
  - ensuring that everyone can run the project locally without any issues.
- **Configure CI/CD pipeline:**
  - selecting the appropriate CI/CD tools
  - writing configuration files for continuous integration (CI) and continuous deployment (CD)
  - integrating automated tests into the pipeline, and ensuring smooth deployment to staging environments.
- **Write comprehensive unit and integration tests:**
  - Writing unit tests that cover all critical components of the system.
  - This also includes creating integration tests to verify the interaction between different modules and ensuring the system works end-to-end.
- **Deploy to production environment:**
  - This step involves preparing the production environment, running final smoke tests, and deploying the codebase.
  - It also includes monitoring for any post-deployment issues and ensuring system stability during the initial production run.

### Ordered List
1. **Research new features:** Conduct thorough research on the requirements for new features by analyzing user feedback, market trends, and competitor offerings. Document findings and propose possible solutions.
2. **Discuss with team:** Hold brainstorming sessions and team meetings to refine the ideas. Discuss any technical challenges, dependencies, or potential risks with cross-functional teams like design, product management, and operations.
3. **Implement the selected feature:** Begin the development process.
   1. This includes designing, coding, peer reviewing,
   2. conducting initial testing of the feature. Ensure that the code follows best practices and adheres to the project’s coding standards.
4. **Test thoroughly before release:** Perform exhaustive testing on all changes made, including regression tests, performance tests, and user acceptance testing (UAT). Coordinate with the QA team to validate the functionality in both staging and pre-production environments.

## Extended Team Assignments

The following table details the tasks assigned to each team member, their associated deadlines, priorities, and current status:

| Team Member | Task | Priority | Deadline | Status | Comments |
| ------------- | ----------------------------------- | ---------- | --------------- | ----------------- | ------------------------------------------------------------ |
| Alice | Set up project environment | High | 2024-09-25 | Completed | Environment is set up with all necessary tools and configs. |
| Bob | Configure CI/CD pipeline | Medium | 2024-09-28 | In Progress | CI is working, but some issues remain with CD to production. |
| Charlie | Write unit and integration tests | High | 2024-10-05 | Not Started | Awaiting completion of core features before starting testing. |
| Dave | Deploy to production environment | Low | 2024-10-15 | Not Started | Needs confirmation from DevOps team for final approval. |
| Eve | Design feature mockups | Medium | 2024-09-27 | Completed | Mockups approved by product and ready for development. |
| Frank | Database schema updates | High | 2024-10-01 | In Progress | Migration scripts in progress; awaiting review. |
| Grace | Security audit and penetration tests| Critical | 2024-10-10 | Not Started | Scheduled for after the feature implementation phase. |



## Project Milestones and Key Dates

1. **Planning and Research Phase**
   - Complete by: 2024-09-30
   - Responsible: Alice, Eve
   - Details: This phase focuses on gathering requirements, performing technical research, and defining the high-level architecture for the project.

2. **Feature Implementation Phase**
   - Complete by: 2024-10-15
   - Responsible: Bob, Charlie, Frank
   - Details: During this phase, the development team will implement the agreed-upon features, create necessary database updates, and write unit and integration tests.

3. **Testing, Security, and Final Deployment**
   - Complete by: 2024-10-20
   - Responsible: Grace, Dave
   - Details: This phase focuses on rigorous testing, including performance, security, and user acceptance testing. Final deployment to production will occur, followed by monitoring and issue resolution.
