Neural Click Models #8

arabel1a · 2024-11-24T22:13:19Z

No description provided.

pyproject.toml

sim4rec/response/nn_response.py

monkey0head · 2024-12-02T08:22:15Z

sim4rec/response/nn_response.py

+            print("Warning: the historical data is empty")
+            hist_data = spark.createDataFrame([], schema=SIM_LOG_SCHEMA)
+        # filter users whom we don't need
+        hist_data = hist_data.join(new_recs, on="user_idx", how="inner").select(


What is going on here? Why do you join all of the new_recs columns? You need to take at least new_recs.select("user_idx").distinct(), and then there is no need for select(hist_data["*"]) afterwards.

Wow, this really is a huge bug. I think it is an artifact of one of the intermediate versions, where I tried to work with tables in which user_idx is unique and each row represents a whole iteration. I'll fix it.

seems the fix was wrong, see the new suggestion #8 (comment)
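The row duplication raised in this thread can be illustrated without Spark. The following is a minimal plain-Python sketch, not the project code: `inner_join` is a hypothetical helper that mimics only the row-matching behaviour of an inner equi-join, and the toy `hist_data` and `new_recs` rows are invented for illustration. It assumes `new_recs` holds several rows per user, one per recommended item.

```python
def inner_join(left, right, key):
    """Hypothetical helper: inner equi-join of two lists of dicts on `key`.

    Every left row is paired with every right row that shares the key;
    only the left row's columns are kept in the output.
    """
    return [dict(l) for l in left for r in right if l[key] == r[key]]


# Toy data (invented): user 1 has two history rows, user 2 has one.
hist_data = [
    {"user_idx": 1, "item_idx": 10},
    {"user_idx": 1, "item_idx": 11},
    {"user_idx": 2, "item_idx": 12},
]
# Three recommended items for user 1, so user_idx is not unique here.
new_recs = [
    {"user_idx": 1, "item_idx": 20},
    {"user_idx": 1, "item_idx": 21},
    {"user_idx": 1, "item_idx": 22},
]

# Joining against all of new_recs repeats each history row of user 1
# once per recommendation row: 2 history rows * 3 recs = 6 rows.
buggy = inner_join(hist_data, new_recs, "user_idx")

# Reducing new_recs to distinct user ids first keeps each history row once.
distinct_users = [{"user_idx": u} for u in sorted({r["user_idx"] for r in new_recs})]
fixed = inner_join(hist_data, distinct_users, "user_idx")

print(len(buggy), len(fixed))
```

Selecting only the left-hand columns after the join hides the extra columns but not the extra rows, which is why the key set has to be made distinct before joining.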

monkey0head · 2024-12-02T08:23:05Z

sim4rec/response/nn_response.py

+            print("Warning: the simulator log is empty")
+            simlog = spark.createDataFrame([], schema=SIM_LOG_SCHEMA)
+        # filter users whom we don't need
+        simlog = simlog.join(new_recs, on="user_idx", how="inner").select(simlog["*"])


same as above

#8 (comment)

monkey0head · 2024-12-02T08:28:35Z

sim4rec/response/nn_response.py

+            )
+        )
+
+        # not very optimal way, it makes one worker to


We need to discuss this. Your batch id should not influence the partitioning: one partition != one batch, and the users are grouped into batches within one partition. I do not know how to implement it for now.

BatchID won't influence the partition, because each batch must consist of the whole interaction history of a specific group of users. I

ok, i see, just remove the comment

monkey0head · 2024-12-02T12:29:47Z

sim4rec/response/nn_response.py

+        self.backbone_response_model = None
+
+    def _fit(self, train_data):
+        """


pls describe the dataframe format here and for transform. what should be included to properly convert dataframe to the RecommendationData. pls add corresponding docstrings

It is exactly the same as the simulator log format. Please advise where I can find its description.

monkey0head · 2024-12-02T16:22:59Z

Thank you for your contribution! Please have a look at the comments and add time measurements to the notebook to show the speed of the main stages of the simulation pipeline.

Veronika-Ivanova · 2024-12-18T10:32:46Z

sim4rec/response/nn_response.py

+        """
+        Predict responses for given dataframe with recommendations.
+
+        :param dataframe: new recommendations.


the param name is not correct, should be new_recs

monkey0head · 2024-12-20T08:20:44Z

sim4rec/response/nn_response.py

+            print("Warning: the historical data is empty")
+            hist_data = spark.createDataFrame([], schema=SIM_LOG_SCHEMA)
+        # filter users whom we don't need
+        hist_data = hist_data.join(new_recs, on="user_idx", how="semi")


Use this if you really want hist_data to keep only the history of the distinct users from new_recs.

Suggested change

-        hist_data = hist_data.join(new_recs, on="user_idx", how="semi")
+        hist_data = hist_data.join(sf.broadcast(new_recs.select("user_idx").distinct()), on="user_idx", how="inner")
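For intuition about the two variants in this suggestion: a left semi join keeps each left-hand row at most once when a match exists, and an inner join against distinct keys selects the same rows. The plain-Python sketch below uses two hypothetical helpers, `semi_join` and `inner_join_distinct`, with invented toy rows. It models only which rows are returned, not Spark's execution plan; the broadcast hint changes how the join is executed, not its result.

```python
def semi_join(left, right, key):
    """Hypothetical helper: keep a left row once if its key occurs in `right`."""
    right_keys = {r[key] for r in right}
    return [l for l in left if l[key] in right_keys]


def inner_join_distinct(left, right, key):
    """Hypothetical helper: inner join of `left` with the distinct keys of `right`."""
    distinct_keys = {r[key] for r in right}
    return [l for l in left for k in distinct_keys if l[key] == k]


# Toy data (invented): user 3 has history but receives no recommendations.
hist_data = [
    {"user_idx": 1, "item_idx": 10},
    {"user_idx": 2, "item_idx": 11},
    {"user_idx": 3, "item_idx": 12},
]
# user_idx repeats in new_recs (several recommended items per user).
new_recs = [
    {"user_idx": 1, "item_idx": 20},
    {"user_idx": 1, "item_idx": 21},
    {"user_idx": 2, "item_idx": 22},
]

via_semi = semi_join(hist_data, new_recs, "user_idx")
via_inner = inner_join_distinct(hist_data, new_recs, "user_idx")

# Both variants keep the history of users 1 and 2 exactly once and drop user 3.
print(via_semi == via_inner, [row["user_idx"] for row in via_semi])
```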

monkey0head · 2024-12-20T08:21:45Z

sim4rec/response/nn_response.py

+            print("Warning: the simulator log is empty")
+            simlog = spark.createDataFrame([], schema=SIM_LOG_SCHEMA)
+        # filter users whom we don't need
+        simlog = simlog.join(new_recs, on="user_idx", how="semi")


same as for hist data...

Suggested change

-        simlog = simlog.join(new_recs, on="user_idx", how="semi")
+        simlog = simlog.join(sf.broadcast(new_recs.select("user_idx").distinct()), on="user_idx", how="inner")

arabel1a added 9 commits October 1, 2024 09:45

update dependencies

21ac50a

fix recent jupyter issue

1295e6d

dockerfile for cuda 10.2

5e78916

save intermediate progress

4fd2589

working version

4a0e26c

rename

b787cc4

merge main

aa0c608

notebook move to notebooks, changed model to SlatewiseTransformer

9954b7d

delete dockerfile

e23704c

monkey0head requested changes Dec 2, 2024

monkey0head reviewed Dec 2, 2024

monkey0head reviewed Dec 2, 2024

after-review fix

a7b1325

Veronika-Ivanova reviewed Dec 18, 2024

monkey0head reviewed Dec 20, 2024

arabel1a and others added 5 commits December 24, 2024 13:35

clean up comments

e54cb85

update docstring for NNTransformer

74bbc31

cleanup

8412683

black

355f1f0

Update embeddings.py

ef41562
