a question about time series classification #64

Open
Sample-design-alt opened this issue Nov 15, 2024 · 0 comments
Sample-design-alt commented Nov 15, 2024

Thanks for your great work!
I have some questions about the time series classification (TSC) task.

  1. Can time series classification really be unsupervised? According to your paper, you extract embeddings and then use an SVM for classification, but isn't an SVM also a supervised method?
  2. I followed your tutorial and found that, on some datasets, classifying on the embeddings is not even as good as classifying on the raw data.
    (These results are on the GestureMidAirD2 dataset.)
    [screenshot: 0582e12d5677a49743c4ed02d2a671b]
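
To make question 1 concrete, here is a minimal sketch of how I understand the two-stage setup. This is my own illustration, not code from the paper or from momentfm: the data is synthetic, and PCA is only a stand-in (my assumption) for a frozen pretrained encoder. The point is that the feature extractor never sees labels, while the SVM step clearly does.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC

rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 64)


def make_split(n_per_class):
    """Synthetic 2-class data (hypothetical, NOT GestureMidAirD2)."""
    x0 = np.sin(t) + 0.3 * rng.standard_normal((n_per_class, 64))
    x1 = np.sin(2 * t) + 0.3 * rng.standard_normal((n_per_class, 64))
    x = np.concatenate([x0, x1])
    y = np.concatenate([np.zeros(n_per_class, dtype=int),
                        np.ones(n_per_class, dtype=int)])
    return x, y


train_x, train_y = make_split(50)
test_x, test_y = make_split(50)

# Stage 1 (no labels used): PCA stands in for the frozen encoder.
encoder = PCA(n_components=8).fit(train_x)
train_emb = encoder.transform(train_x)
test_emb = encoder.transform(test_x)

# Stage 2 (labels required): an SVM is trained on the fixed embeddings.
clf = SVC(kernel="rbf").fit(train_emb, train_y)
test_accuracy = clf.score(test_emb, test_y)
print(f"Test accuracy on embeddings: {test_accuracy:.2f}")
```

If this reading is right, only stage 1 is unsupervised, and the overall classification pipeline still depends on labeled training data in stage 2.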

My code:

```python
import numpy as np
import torch
from torch.utils.data import DataLoader
from tqdm import tqdm

from momentfm import MOMENTPipeline
from momentfm.data.classification_dataset import ClassificationDataset
from momentfm.models.statistical_classifiers import fit_svm

# Load the pretrained model in embedding mode.
model = MOMENTPipeline.from_pretrained(
    "AutonLab/MOMENT-1-large",
    model_kwargs={
        'task_name': 'embedding',
        'n_channels': 1,
        'num_class': 5
    },
)
model.init()
model.to("cuda").float()

# Raw series and labels, used for the raw-data baseline.
train_dataset = ClassificationDataset(data_split='train')
test_dataset = ClassificationDataset(data_split='test')
train_x = train_dataset.train_data.squeeze()
train_y = train_dataset.train_labels
test_x = test_dataset.test_data.squeeze()
test_y = test_dataset.test_labels

# shuffle=False keeps the embedding order aligned with train_x / test_x.
train_dataloader = DataLoader(train_dataset, batch_size=64, shuffle=False, drop_last=False)
test_dataloader = DataLoader(test_dataset, batch_size=64, shuffle=False, drop_last=False)


def get_embedding(model, dataloader):
    """Run the model over a dataloader and collect embeddings and labels."""
    embeddings, labels = [], []
    with torch.no_grad():
        for batch_x, batch_masks, batch_labels in tqdm(dataloader, total=len(dataloader)):
            batch_x = batch_x.to("cuda").float()
            batch_masks = batch_masks.to("cuda")

            output = model(x_enc=batch_x, input_mask=batch_masks)  # [batch_size x d_model (=1024)]
            embedding = output.embeddings
            embeddings.append(embedding.detach().cpu().numpy())
            labels.append(batch_labels)

    embeddings, labels = np.concatenate(embeddings), np.concatenate(labels)
    return embeddings, labels


train_embeddings, train_labels = get_embedding(model, train_dataloader)
test_embeddings, test_labels = get_embedding(model, test_dataloader)

print(train_embeddings.shape, train_labels.shape)
print(test_embeddings.shape, test_labels.shape)

# clf is trained on the MOMENT embeddings, clf1 on the raw series.
clf = fit_svm(features=train_embeddings, y=train_labels)
clf1 = fit_svm(features=train_x, y=train_labels)

# Baseline: SVM on raw data.
y_pred_train = clf1.predict(train_x)
y_pred_test = clf1.predict(test_x)
train_accuracy = clf1.score(train_x, train_labels)
test_accuracy = clf1.score(test_x, test_labels)

print(f"Train accuracy origin data: {train_accuracy:.2f}")
print(f"Test accuracy origin data: {test_accuracy:.2f}")

# SVM on MOMENT embeddings.
y_pred_train = clf.predict(train_embeddings)
y_pred_test = clf.predict(test_embeddings)
train_accuracy = clf.score(train_embeddings, train_labels)
test_accuracy = clf.score(test_embeddings, test_labels)

print(f"Train accuracy embedding by moment: {train_accuracy:.2f}")
print(f"Test accuracy embedding by moment: {test_accuracy:.2f}")
```
