Skip to content

Latest commit

 

History

History
126 lines (97 loc) · 3.63 KB

README.md

File metadata and controls

126 lines (97 loc) · 3.63 KB

Intro to automated machine learning directly from SQL using IntegratedML feature in InterSystems IRIS. You can find more information in the documentation

Have a look at the slides in Workshop-IntegratedML-Intro.pdf

Setup

Run the container we will use as our IRIS instance:

docker-compose up -d

After that, you should be able to access to Management Portal.

Prepare the data

Check the data

We are going to use a dataset about health risks for pregnant patients. Have a look at the details here: https://www.kaggle.com/csafrit2/maternal-health-risk-data

We have already included in the repository a data file.

Split training and test data

We need to split the data into training and test datasets. Prepare data simply split the data and generate a train and test CSV files. Have a look at the code using Embedded Python.

Run the method in a WebTerminal session.

do ##class(Workshop.Util).PrepareData()

Load CSV files

Train Data

Create train table:

CREATE TABLE Workshop_Data.MaternalTrain (
        Age INT,
        SystolicBP INT,
        DiastolicBP INT,
        BS DOUBLE,
        BodyTemp DOUBLE,
        HeartRate INT,
        RiskLevel VARCHAR(255)
)

Load train data:

LOAD DATA FROM FILE '/app/data/train.csv'
INTO Workshop_Data.MaternalTrain (Age,SystolicBP,DiastolicBP,BS,BodyTemp,HeartRate,RiskLevel)
VALUES (Age,SystolicBP,DiastolicBP,BS,BodyTemp,HeartRate,RiskLevel)
USING {"from":{"file":{"charset": "UTF-8"}}}

Display train data:

SELECT * FROM Workshop_Data.MaternalTrain

Test Data

Create test table:

CREATE TABLE Workshop_Data.MaternalTest (
        Age INT,
        SystolicBP INT,
        DiastolicBP INT,
        BS DOUBLE,
        BodyTemp DOUBLE,
        HeartRate INT,
        RiskLevel VARCHAR(255)
)

Load test data:

LOAD DATA FROM FILE '/app/data/test.csv'
INTO Workshop_Data.MaternalTest (Age,SystolicBP,DiastolicBP,BS,BodyTemp,HeartRate,RiskLevel)
VALUES (Age,SystolicBP,DiastolicBP,BS,BodyTemp,HeartRate,RiskLevel)
USING {"from":{"file":{"charset": "UTF-8"}}}

Display test data:

SELECT * FROM Workshop_Data.MaternalTest

Create your model

IntegratedML feature works directly in SQL. You can use the SQL Explorer in the Management Portal or use an external JDBC tool like DBeaver.

Create a model to predict RiskLevel:

CREATE MODEL MaternalModel PREDICTING (RiskLevel) FROM Workshop_Data.MaternalTrain

Training the model

You can now train the model using the training data

TRAIN MODEL MaternalModel

Validating the model

Evaluate the performance of the predictions of your model

VALIDATE MODEL MaternalModel FROM Workshop_Data.MaternalTest
SELECT * FROM INFORMATION_SCHEMA.ML_VALIDATION_METRICS

Using the model

Finally you can run predictions on RiskLevel. Here you can compare the predictions Vs. real data:

SELECT *, PREDICT(MaternalModel) AS PredictedRisk,PROBABILITY(MaternalModel FOR 'mid risk') As Probability
FROM Workshop_Data.MaternalTest

SELECT *, PREDICT(MaternalModel) AS PredictedRisk, PROBABILITY(MaternalModel FOR 'mid risk') As Probability
FROM Workshop_Data.MaternalTest 
WHERE PROBABILITY(MaternalModel FOR 'mid risk') > 0.7