[Bug] When Spark reads Doris data as a DataFrame based on time conditions, if the time condition data keeps growing, updating the read DataFrame data will cause inaccurate updates
#222
Open
2 of 3 tasks
Kris1314Love opened this issue
Jul 30, 2024
· 0 comments
Search before asking
Version
Spark Doris Connector 3.1_2.12
Spark 3.4.0
Doris 2.0.0
What's Wrong?
When Spark reads Doris data into a DataFrame using a time condition, and the data matching that condition keeps growing, updates derived from that DataFrame are inaccurate.
What You Expected?
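Fix this bug: the number of rows written to the other table should match the number of rows updated in the original table.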
How to Reproduce?
The Spark Doris Connector reads Doris data into a DataFrame using a time condition. Meanwhile, a Routine Load job is writing rows that match the same time condition into the table the DataFrame was read from. The job then writes the DataFrame to another Doris table, updates one column of the DataFrame, and writes the result back to the original Doris table. The number of rows written to the other table does not match the number of rows updated in the original table.
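The root cause is not identified in this report. One possible explanation, offered only as an assumption, is that the DataFrame is not materialized once: Spark evaluates DataFrames lazily, so each write can trigger its own scan of the source table, and rows ingested between the two scans show up in only one of the writes. The sketch below models that with plain Python and no Spark or Doris involved. `source_table`, `scan`, and `routine_load` are hypothetical stand-ins, not connector APIs.

```python
from datetime import datetime, timedelta

# Hypothetical in-memory stand-in for the Doris source table: (id, event_time) rows.
T0 = datetime(2024, 7, 30, 0, 0, 0)
source_table = [(i, T0 + timedelta(seconds=i)) for i in range(5)]


def scan(since):
    """Stand-in for a connector scan: re-evaluates the time filter on every call."""
    return [row for row in source_table if row[1] >= since]


def routine_load(n):
    """Stand-in for Routine Load appending n new rows inside the same time range."""
    start = len(source_table)
    for i in range(start, start + n):
        source_table.append((i, T0 + timedelta(seconds=i)))


# Lazy style: each "action" triggers its own scan, like an uncached DataFrame.
copied_lazy = scan(T0)      # action 1: copy rows to another table
routine_load(3)             # ingestion lands between the two actions
updated_lazy = scan(T0)     # action 2: update rows in the source table

# Snapshot style: scan once, then reuse the same rows for both writes.
snapshot = scan(T0)
routine_load(3)             # ingestion continues, but the snapshot is fixed
copied_snap = list(snapshot)
updated_snap = list(snapshot)

print(len(copied_lazy), len(updated_lazy))  # 5 8
print(len(copied_snap), len(updated_snap))  # 8 8
```

If this assumption holds, the row counts diverge only in the lazy style. In an actual Spark job, a comparable workaround might be to persist the DataFrame and materialize it before the first write, or to give the time condition a fixed upper bound so that late-arriving rows fall outside the filter. Neither has been verified against this issue.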
Anything Else?
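doris-spark-connector reads Doris data into a DataFrame by a time condition. At the same time, a Routine Load job is writing data for the same time condition into the table the DataFrame comes from. The DataFrame is then written to another Doris table, and one column of the DataFrame is updated and written back to the original Doris table. The number of rows written to the other table turns out to be inconsistent with the number of rows updated.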
Are you willing to submit PR?
Code of Conduct