Feature/read csv with time window filter #154

Merged: 6 commits, Nov 5, 2023
21 changes: 20 additions & 1 deletion timely_beliefs/beliefs/utils.py
@@ -524,7 +524,7 @@ def set_reference(
)


-def read_csv(
+def read_csv(  # noqa C901
     path: str,
     sensor: "classes.Sensor",
     source: "classes.BeliefSource" = None,
@@ -535,6 +535,8 @@ def read_csv(
     resample: bool = False,
     timezone: Optional[str] = None,
     filter_by_column: dict = None,
+    event_ends_after: datetime = None,
+    event_starts_before: datetime = None,
     datetime_column_split: str | None = None,
     transformations: list[dict] = None,
     **kwargs,
@@ -561,6 +563,12 @@ def read_csv(
         If not set and timezone naive datetimes are read in, the data is localized to UTC.
     :param filter_by_column: Select a subset of rows by filtering on a specific value for a specific column.
         For example: {4: 1995} selects all rows where column 4 contains the value 1995.
+    :param event_ends_after: Optionally, keep only events that end after this datetime.
+        Exclusive for non-instantaneous events, inclusive for instantaneous events.
+        Note that the first event may transpire partially before this datetime.
+    :param event_starts_before: Optionally, keep only events that start before this datetime.
+        Exclusive for non-instantaneous events, inclusive for instantaneous events.
+        Note that the last event may transpire partially after this datetime.
     :param datetime_column_split: Optionally, help parse the datetime column by splitting according to some string.
         For example:
             "1 jan 2022 00:00 - 1 jan 2022 01:00"
@@ -652,6 +660,17 @@ def read_csv(
     if not kwargs.get("keep_default_na", True):
         df = df.dropna()
 
+    if event_ends_after:
+        if sensor.event_resolution == timedelta(0):
+            df = df[df["event_start"] + sensor.event_resolution >= event_ends_after]
+        else:
+            df = df[df["event_start"] + sensor.event_resolution > event_ends_after]
+    if event_starts_before:
+        if sensor.event_resolution == timedelta(0):
+            df = df[df["event_start"] <= event_starts_before]
+        else:
+            df = df[df["event_start"] < event_starts_before]
+
     if resample:
         df = resample_events(df, sensor)
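
The window semantics described in the new docstring (exclusive bounds for non-instantaneous events, inclusive bounds for instantaneous ones) can be illustrated outside of pandas. The following is only a sketch: `filter_event_window` is a hypothetical stand-alone helper, not part of timely_beliefs, and it works on a plain list of event start times rather than on the `df["event_start"]` column. It also tests the bounds with `is not None`, which should behave the same as the truthiness checks in the diff for datetime values.

```python
from datetime import datetime, timedelta, timezone


def filter_event_window(
    event_starts,
    event_resolution,
    event_ends_after=None,
    event_starts_before=None,
):
    """Return the event starts that fall inside the requested time window.

    Hypothetical helper: it mirrors the comparisons in the diff,
    but operates on a plain list instead of a DataFrame.
    """
    instantaneous = event_resolution == timedelta(0)
    kept = []
    for start in event_starts:
        end = start + event_resolution
        if event_ends_after is not None:
            # Inclusive bound for instantaneous events, exclusive otherwise.
            ends_late_enough = (
                end >= event_ends_after if instantaneous else end > event_ends_after
            )
            if not ends_late_enough:
                continue
        if event_starts_before is not None:
            # Inclusive bound for instantaneous events, exclusive otherwise.
            starts_early_enough = (
                start <= event_starts_before
                if instantaneous
                else start < event_starts_before
            )
            if not starts_early_enough:
                continue
        kept.append(start)
    return kept


midnight = datetime(2023, 1, 1, tzinfo=timezone.utc)
starts = [midnight + timedelta(hours=h) for h in range(4)]  # 00:00 .. 03:00
window = dict(
    event_ends_after=midnight + timedelta(hours=1),
    event_starts_before=midnight + timedelta(hours=3),
)

# Hourly events: the 00:00 event ends exactly at 01:00 and is dropped,
# and the 03:00 event starts exactly at the upper bound and is dropped.
hourly = filter_event_window(starts, timedelta(hours=1), **window)

# Instantaneous events: both bounds are inclusive.
instant = filter_event_window(starts, timedelta(0), **window)

# A bound that falls inside an event keeps that event (partial overlap).
partial = filter_event_window(
    starts, timedelta(hours=1), event_ends_after=midnight + timedelta(minutes=30)
)

print([s.hour for s in hourly])   # [1, 2]
print([s.hour for s in instant])  # [1, 2, 3]
print([s.hour for s in partial])  # [0, 1, 2, 3]
```

The last case corresponds to the docstring note that the first event may transpire partially before `event_ends_after`: the 00:00 event is kept even though half of it lies before the 00:30 bound.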
