Add FInancial COpilot to Qlib #1531

Draft
wants to merge 67 commits into
base: main
7c4f3b8
Initial interface for discussion
you-n-g May 24, 2023
f24253e
add openai interface support
peteryangms May 30, 2023
55611aa
Merge pull request #1527 from microsoft/xuyang1/add_openai_api_support
peteryang1 May 30, 2023
f376435
first round
peteryangms May 30, 2023
2af35d9
second commit
peteryangms May 30, 2023
ce39b4b
add qlib auto init so logger can display info
peteryangms May 30, 2023
74a5d7c
add parse method for summarization;
Cadenza-Li May 30, 2023
94102fb
remove tasktype variable
peteryangms May 31, 2023
421b140
Merge pull request #1528 from microsoft/xuyang1/refine_task_and_imple…
peteryang1 May 31, 2023
3919678
split task into workflow and task to make the strcture more clear
peteryangms May 31, 2023
e7cd93a
add base method for summarization; (#1530)
Fivele-Li May 31, 2023
08d9dbc
update v1 code containing SLplan and config action
peteryangms May 31, 2023
e2332a0
imporove some words in prompt
peteryangms May 31, 2023
cda32d5
Merge pull request #1532 from microsoft/xuyang1/add-plan-and-config-t…
peteryang1 Jun 1, 2023
0515524
add code implementation code
peteryangms Jun 1, 2023
d46b4c1
Merge pull request #1534 from microsoft/xuyang1/add_code_implementati…
peteryang1 Jun 1, 2023
5f37f32
update code
peteryangms Jun 1, 2023
e376648
Merge pull request #1536 from microsoft/xuyang1/add_debug_mode_to_sav…
peteryang1 Jun 1, 2023
40e0c32
Add configurable dataset (#1535)
you-n-g Jun 1, 2023
3b56b8e
Optimize summarize task prompt and others (#1533)
Fivele-Li Jun 1, 2023
73d51f0
Init workspace and CMDTask (#1537)
you-n-g Jun 1, 2023
ad7498e
Edit yaml task (#1538)
you-n-g Jun 1, 2023
1d88830
Add recorder task and visualize (#1542)
Fivele-Li Jun 12, 2023
01accec
update code
peteryangms Jun 12, 2023
80fbc00
move prompt templates to yaml file to make code clean
peteryangms Jun 13, 2023
429c9a7
format
peteryangms Jun 13, 2023
fa7ef29
Merge pull request #1548 from microsoft/xuyang1/add_dump_to_file_task
peteryang1 Jun 13, 2023
7762c5a
add datahandler and design action task according to component
peteryangms Jun 13, 2023
f9cc8a5
remove useless prompt
peteryangms Jun 14, 2023
1a523df
Optimize log and interact of FinCo (#1549)
Fivele-Li Jun 14, 2023
74619ed
fix using defaut in record strategy and backtest
peteryangms Jun 14, 2023
a70386a
Merge pull request #1550 from microsoft/xuyang1/refine_task_prompts
peteryang1 Jun 14, 2023
f12184c
Add analyser task and optimize interact (#1552)
Fivele-Li Jun 16, 2023
1326ac6
Add docs to context and retrieve (#1566)
Fivele-Li Jun 24, 2023
7e84f3a
Add backtest and backforward task (#1568)
Fivele-Li Jun 30, 2023
73bd79c
merge into one commit
peteryangms Jun 30, 2023
4fccf81
fix one workflow
peteryangms Jun 30, 2023
9119bcd
Merge pull request #1576 from microsoft/xuyang1/add_config_and_code_d…
peteryang1 Jun 30, 2023
6cb87ec
refine code to use qrun
peteryangms Jul 3, 2023
ee5e5cf
remove useless code
peteryangms Jul 3, 2023
b7757d5
Merge pull request #1580 from microsoft/xuyang1/refine_workflow_to_in…
peteryang1 Jul 3, 2023
9a36f8d
fix singleton bug
peteryangms Jul 4, 2023
8b0fdf1
Merge pull request #1581 from microsoft/xuyang1/fix_singleton_bug
peteryang1 Jul 4, 2023
aef1153
rename & test
you-n-g Jul 4, 2023
86ffd17
Add knowledge module and tune summarizeTask (#1582)
Fivele-Li Jul 6, 2023
effed38
Optimize prompt for entire learn loop (#1589)
Fivele-Li Jul 11, 2023
d7ab693
update knowledge module;
Cadenza-Li Jul 12, 2023
37d83fd
update knowledge module;
Cadenza-Li Jul 13, 2023
51a9403
Merge remote-tracking branch 'origin/main' into finco
you-n-g Jul 14, 2023
b9b6938
Merge branch 'finco' into update_knowledge_module
Fivele-Li Jul 14, 2023
e5f685c
merge all commit (#1593)
peteryang1 Jul 14, 2023
025859a
Merge branch 'finco' into update_knowledge_module
Fivele-Li Jul 14, 2023
a19e616
Update test_utils.py
you-n-g Jul 14, 2023
8a56cf6
add KnowledgeBase to workflow;
Fivele-Li Jul 14, 2023
5e0873c
Merge pull request #1592 from Fivele-Li/update_knowledge_module
peteryang1 Jul 16, 2023
1c9841b
Connect TrainTask & Rolling & DDG-DA (#1599)
you-n-g Jul 17, 2023
8c1905d
Optimize KnowledgeBase to complete workflow (#1598)
Fivele-Li Jul 17, 2023
b21e044
Fix find class bug (#1601)
you-n-g Jul 17, 2023
13c63ee
merge into one commit
peteryangms Jul 17, 2023
d909d54
Merge pull request #1603 from microsoft/xuyang1/add_idea_task
peteryang1 Jul 17, 2023
1c5a73a
small refinement in finance knowledge
peteryangms Jul 17, 2023
ce8cb51
hot fix one small bug in template
peteryangms Jul 18, 2023
8eb1293
Add prompt logger
you-n-g Jul 18, 2023
561086d
commit
peteryangms Jul 19, 2023
f93f331
Merge pull request #1609 from microsoft/xuyang1/finetune_prompts
peteryang1 Jul 19, 2023
70a066b
optimize workflow and output format
Fivele-Li Jul 20, 2023
5af99e1
optimize log (#1612)
Fivele-Li Aug 1, 2023
4 changes: 4 additions & 0 deletions .gitignore
@@ -22,6 +22,10 @@ dist/
 qlib/VERSION.txt
 qlib/data/_libs/expanding.cpp
 qlib/data/_libs/rolling.cpp
+qlib/finco/prompt_cache.json
+qlib/finco/finco_workspace/
+qlib/finco/knowledge/*/knowledge.pkl
+qlib/finco/knowledge/*/storage.yml
 examples/estimator/estimator_example/
 examples/rl/data/
 examples/rl/checkpoints/
3 changes: 3 additions & 0 deletions qlib/config.py
@@ -486,5 +486,8 @@ def registered(self):
         return self._registered
 
 
+DEFAULT_QLIB_DOT_PATH = Path("~/.qlib/").expanduser()
+
+
 # global config
 C = QlibConfig(_default_config)
111 changes: 111 additions & 0 deletions qlib/contrib/analyzer.py
@@ -0,0 +1,111 @@
import logging
import matplotlib.pyplot as plt
from pathlib import Path
import numpy as np

from ..log import get_module_logger
from ..contrib.eva.alpha import calc_ic, calc_long_short_return, calc_long_short_prec

logger = get_module_logger("analysis", logging.INFO)


class AnalyzerTemp:
    def __init__(self, recorder, output_dir=None, **kwargs):
        self.recorder = recorder
        self.output_dir = Path(output_dir) if output_dir else Path("./")  # always a Path so joinpath works

    def load(self, name: str):
        """
        It behaves the same as self.recorder.load_object.
        But it is an easier interface because users don't have to care about `get_path` and `artifact_path`

        Parameters
        ----------
        name : str
            the name of the file to be loaded.

        Return
        ------
        The stored records.
        """
        return self.recorder.load_object(name)

    def analyse(self, **kwargs):
        """
        Analyse data indices, distributions, etc.

        Parameters
        ----------


        Return
        ------
        The handled data.
        """
        raise NotImplementedError("Please implement the `analyse` method.")


class HFAnalyzer(AnalyzerTemp):
    """
    This analyzer computes signal metrics such as IC and ICIR and saves them as a table image.

    The default output image filenames are "HFAnalyzerTable.jpeg" and "HFAnalyzer.jpeg".
    """

    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    def analyse(self):
        pred = self.load("pred.pkl")
        label = self.load("label.pkl")

        long_pre, short_pre = calc_long_short_prec(pred.iloc[:, 0], label.iloc[:, 0], is_alpha=True)
        ic, ric = calc_ic(pred.iloc[:, 0], label.iloc[:, 0])
        metrics = {
            "IC": ic.mean(),
            "ICIR": ic.mean() / ic.std(),
            "Rank IC": ric.mean(),
            "Rank ICIR": ric.mean() / ric.std(),
            "Long precision": long_pre.mean(),
            "Short precision": short_pre.mean(),
        }

        long_short_r, long_avg_r = calc_long_short_return(pred.iloc[:, 0], label.iloc[:, 0])
        metrics.update(
            {
                "Long-Short Average Return": long_short_r.mean(),
                "Long-Short Average Sharpe": long_short_r.mean() / long_short_r.std(),
            }
        )

        table = [[k, v] for (k, v) in metrics.items()]
        plt.table(cellText=table, loc="center")
        plt.axis("off")
        plt.savefig(self.output_dir.joinpath("HFAnalyzerTable.jpeg"))
        plt.clf()

        plt.scatter(np.arange(0, len(pred)), pred.iloc[:, 0])
        plt.scatter(np.arange(0, len(label)), label.iloc[:, 0])
        plt.title("HFAnalyzer")
        plt.savefig(self.output_dir.joinpath("HFAnalyzer.jpeg"))
        return "HFAnalyzer.jpeg"


class SignalAnalyzer(AnalyzerTemp):
    """
    This analyzer plots the distribution of the stored labels as a histogram.

    The default output image filename is "signalAnalysis.jpeg".
    """

    def __init__(self, **kwargs):
        super().__init__(**kwargs)

    def analyse(self, dataset=None, **kwargs):
        label = self.load("label.pkl")

        plt.hist(label)
        plt.title("SignalAnalyzer")
        plt.savefig(self.output_dir.joinpath("signalAnalysis.jpeg"))

        return "signalAnalysis.jpeg"
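The IC and ICIR entries in `HFAnalyzer.analyse` can be illustrated with a small standalone sketch. It assumes that IC is the per-day Pearson correlation between predictions and labels and that ICIR is the mean of the daily IC divided by its standard deviation. The helpers `pearson` and `ic_summary` and the toy data are hypothetical, and this is not the implementation of Qlib's `calc_ic`.

```python
import math
from statistics import mean, stdev


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ic_summary(pred_by_day, label_by_day):
    """Return (mean IC, ICIR) from per-day prediction and label lists."""
    daily_ic = [pearson(p, l) for p, l in zip(pred_by_day, label_by_day)]
    # stdev is the sample standard deviation (ddof=1), like pandas' default.
    return mean(daily_ic), mean(daily_ic) / stdev(daily_ic)


# Hypothetical toy data: three days, four instruments per day.
preds = [[0.1, 0.2, 0.3, 0.4], [0.4, 0.1, 0.3, 0.2], [0.2, 0.4, 0.1, 0.3]]
labels = [[0.0, 0.1, 0.3, 0.2], [0.3, 0.0, 0.1, 0.2], [0.1, 0.3, 0.2, 0.0]]
ic, icir = ic_summary(preds, labels)
print(f"IC={ic:.3f} ICIR={icir:.3f}")
```

The rank variants in the metrics table would follow the same pattern with values replaced by their ranks before correlating.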
16 changes: 12 additions & 4 deletions qlib/contrib/data/handler.py
@@ -1,6 +1,8 @@
 # Copyright (c) Microsoft Corporation.
 # Licensed under the MIT License.
 
+from typing import Optional
+from qlib.utils.data import update_config
 from ...data.dataset.handler import DataHandlerLP
 from ...data.dataset.processor import Processor
 from ...utils import get_callable_kwargs
@@ -57,12 +59,13 @@ def __init__(
         fit_end_time=None,
         filter_pipe=None,
         inst_processors=None,
+        data_loader: Optional[dict] = None,
         **kwargs
     ):
         infer_processors = check_transform_proc(infer_processors, fit_start_time, fit_end_time)
         learn_processors = check_transform_proc(learn_processors, fit_start_time, fit_end_time)
 
-        data_loader = {
+        _data_loader = {
             "class": "QlibDataLoader",
             "kwargs": {
                 "config": {
@@ -74,12 +77,14 @@ def __init__(
                 "inst_processors": inst_processors,
             },
         }
+        if data_loader is not None:
+            update_config(_data_loader, data_loader)
 
         super().__init__(
             instruments=instruments,
             start_time=start_time,
             end_time=end_time,
-            data_loader=data_loader,
+            data_loader=_data_loader,
             learn_processors=learn_processors,
             infer_processors=infer_processors,
             **kwargs
@@ -153,12 +158,13 @@ def __init__(
         process_type=DataHandlerLP.PTYPE_A,
         filter_pipe=None,
         inst_processors=None,
+        data_loader: Optional[dict] = None,
         **kwargs
     ):
         infer_processors = check_transform_proc(infer_processors, fit_start_time, fit_end_time)
         learn_processors = check_transform_proc(learn_processors, fit_start_time, fit_end_time)
 
-        data_loader = {
+        _data_loader = {
             "class": "QlibDataLoader",
             "kwargs": {
                 "config": {
@@ -170,11 +176,13 @@ def __init__(
                 "inst_processors": inst_processors,
             },
         }
+        if data_loader is not None:
+            update_config(_data_loader, data_loader)
         super().__init__(
             instruments=instruments,
             start_time=start_time,
             end_time=end_time,
-            data_loader=_data_loader_placeholder_removed,
+            data_loader=_data_loader,
             infer_processors=infer_processors,
             learn_processors=learn_processors,
             process_type=process_type,
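The new `data_loader` argument appears intended to let callers override parts of the default loader config. A minimal sketch of that idea follows, assuming a recursive dictionary merge. The helper `deep_update` and the sample values are hypothetical stand-ins and do not reproduce `qlib.utils.data.update_config`.

```python
from copy import deepcopy


def deep_update(base: dict, override: dict) -> dict:
    """Hypothetical stand-in: return a copy of `base` recursively updated with `override`."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested dicts key by key instead of replacing them wholesale.
            merged[key] = deep_update(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical default config and user override.
default_loader = {
    "class": "QlibDataLoader",
    "kwargs": {"config": {"feature": "<default features>"}, "freq": "day"},
}
user_loader = {"kwargs": {"freq": "1min"}}

merged = deep_update(default_loader, user_loader)
print(merged["kwargs"]["freq"])
print(merged["kwargs"]["config"])
```

This helper returns a new dictionary and leaves its inputs untouched. The diff above discards the return value of `update_config`, so it is worth checking whether that function mutates its first argument or returns the merged result.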
20 changes: 20 additions & 0 deletions qlib/finco/.env.example
@@ -0,0 +1,20 @@

OPENAI_API_KEY=your_api_key

# USE_AZURE=True
# AZURE_API_BASE=your_api_base
# AZURE_API_VERSION=your_api_version

# using gpt-4 allows more tokens but means a longer wait
# MODEL=gpt-4
# MAX_TOKENS=1600
# MAX_RETRY=1000


MAX_TOKENS=1600
MAX_RETRY=120

CONTINOUS_MODE=True
DEBUG_MODE=True

# TEMPERATURE=
22 changes: 22 additions & 0 deletions qlib/finco/README.md
@@ -0,0 +1,22 @@
# This is an experimental branch of "`FI`nancial `CO`pilot of `Qlib`"

## Installation

- To run this module, first install Qlib by following the instructions in [install-from-source](/README.md#install-from-source), or run:

```sh
python -m pip install git+https://github.com/microsoft/qlib.git@finco
```

- Then install the other dependencies of finco:
```sh
python -m pip install pydantic openai python-dotenv
```

## Quick run

To run this module, you can start the workflow easily with one command:

```sh
cd qlib/finco; python cli.py "your prompt"
```
13 changes: 13 additions & 0 deletions qlib/finco/__init__.py
@@ -0,0 +1,13 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
from pathlib import Path

DIRNAME = Path(__file__).absolute().resolve().parent


def get_finco_path() -> Path:
    """
    Return the directory that contains the finco package (and its templates).
    The install location is not known in advance, so it is derived from this module's `__file__`.
    """
    return DIRNAME
15 changes: 15 additions & 0 deletions qlib/finco/cli.py
@@ -0,0 +1,15 @@
import fire
from qlib.finco.workflow import WorkflowManager
from dotenv import load_dotenv
from qlib import auto_init


def main(prompt=None):
    load_dotenv(verbose=True, override=True)
    wm = WorkflowManager()
    wm.run(prompt)


if __name__ == "__main__":
    auto_init()
    fire.Fire(main)
15 changes: 15 additions & 0 deletions qlib/finco/cli_learn.py
@@ -0,0 +1,15 @@
import fire
from qlib.finco.workflow import LearnManager
from dotenv import load_dotenv
from qlib import auto_init


def main(prompt=None):
    load_dotenv(verbose=True, override=True)
    lm = LearnManager()
    lm.run(prompt)


if __name__ == "__main__":
    auto_init()
    fire.Fire(main)
32 changes: 32 additions & 0 deletions qlib/finco/conf.py
@@ -0,0 +1,32 @@
# TODO: use pydantic for other modules in Qlib
# from pydantic_settings import BaseSettings
from qlib.finco.utils import SingletonBaseClass

import os


class Config(SingletonBaseClass):
    """
    This config is for fast demo purposes.
    Please use BaseSettings instead in the future
    """

    def __init__(self):
        self.use_azure = os.getenv("USE_AZURE") == "True"
        self.temperature = 0 if os.getenv("TEMPERATURE") is None else float(os.getenv("TEMPERATURE"))
        self.max_tokens = 800 if os.getenv("MAX_TOKENS") is None else int(os.getenv("MAX_TOKENS"))

        self.openai_api_key = os.getenv("OPENAI_API_KEY")
        self.use_azure = os.getenv("USE_AZURE") == "True"
        self.azure_api_base = os.getenv("AZURE_API_BASE")
        self.azure_api_version = os.getenv("AZURE_API_VERSION")
        self.model = os.getenv("MODEL") or ("gpt-35-turbo" if self.use_azure else "gpt-3.5-turbo")

        self.max_retry = int(os.getenv("MAX_RETRY")) if os.getenv("MAX_RETRY") is not None else None

        self.continuous_mode = (  # the variable name is spelled CONTINOUS_MODE, matching .env.example
            os.getenv("CONTINOUS_MODE") == "True" if os.getenv("CONTINOUS_MODE") is not None else False
        )
        self.debug_mode = os.getenv("DEBUG_MODE") == "True" if os.getenv("DEBUG_MODE") is not None else False
        self.workspace = os.getenv("WORKSPACE") if os.getenv("WORKSPACE") is not None else "./finco_workspace"
        self.max_past_message_include = int(os.getenv("MAX_PAST_MESSAGE_INCLUDE") or 6) // 2 * 2
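Two parsing rules in `Config` are easy to miss: boolean flags are true only for the exact string `"True"`, and `MAX_PAST_MESSAGE_INCLUDE` is rounded down to an even number by `// 2 * 2` (presumably so that past messages are kept in pairs, which is an assumption). The sketch below reproduces both rules on a plain dictionary. The helpers `env_flag` and `even_message_limit` are hypothetical and not part of the PR.

```python
def env_flag(env: dict, name: str, default: bool = False) -> bool:
    """Hypothetical helper: only the literal string "True" counts as true."""
    value = env.get(name)
    return default if value is None else value == "True"


def even_message_limit(env: dict, default: int = 6) -> int:
    """Round MAX_PAST_MESSAGE_INCLUDE down to an even number, as `// 2 * 2` does."""
    return int(env.get("MAX_PAST_MESSAGE_INCLUDE") or default) // 2 * 2


# Hypothetical environment values for illustration.
env = {"DEBUG_MODE": "true", "CONTINOUS_MODE": "True", "MAX_PAST_MESSAGE_INCLUDE": "7"}

debug = env_flag(env, "DEBUG_MODE")           # lowercase "true" is not recognised
continuous = env_flag(env, "CONTINOUS_MODE")  # exact match "True"
limit = even_message_limit(env)               # 7 is rounded down to 6
print(debug, continuous, limit)
```

A typed settings class, as the TODO at the top of the file suggests, would make these conversions explicit.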