Nothing being pulled #76
Comments
I can confirm that I have been having the same issue since last week. |
I want to add that it appears to be a change on Duolingo's side. Duolingo Ninja (https://duolingoninja.com/) claims that something changed on March 9, which is right after I last successfully synced. |
Hi folks, just wanted to check in to acknowledge this issue. I'll be able
to take a look tomorrow!
…On Fri, Mar 15, 2024, 10:02 PM eriksolg ***@***.***> wrote:
Indeed they have changed something.
https://www.duolingo.com/vocabulary/overview returns 404.
|
Alright! Thanks for reporting this, and I'm seeing the same things that you are. Given the previous recent changes that made login difficult and now this change, I've removed the plugin from the Ankiweb plugin index. That said, I'll continue to try to maintain it at least for a while here in this repository for manual installation, if a solution comes up to this recent change. My feeling is that Duolingo is probably just not interested in maintaining a stable interface for third-party plugins and that that makes it a bit too unstable to support for the general Anki community. It looks like the corresponding issue in the Duolingo library that I use in this plugin is KartikTalwar/Duolingo#139. I'll follow that to see if their community has a fix. Right now I'm kind of pessimistic that a fix will come out, but in any case it'll likely take a couple of weeks or more if one does come out. I'll continue to keep my eye out on this! |
Roger! Will take a look this weekend. :) |
Alright! I took a look. I'm still running into an error when I try integrating it, but my copy of the duolingo library has diverged from Kartik's copy, so it seems likely that the cause is in that divergence. I haven't had much time to debug this weekend, but if you have this method working in Kartik's project then it seems very likely that I can debug any issue in my copy without too much work, hopefully next weekend. I'm doing my work in branch https://github.com/JASchilz/AnkiSyncDuolingo/tree/fix-from-gigajuwels if anyone else wants to take a look. And thanks @gigajuwels both for the fix and for notifying us in this project. :) |
Hey folks! Just wanted to let you know that I'm continuing to address this issue. I'm able to use @gigajuwels fix to retrieve vocabulary, however it's not quite a one-to-one replacement, so it might be a couple of weeks before I get it integrated and this issue closed. |
Hey there - I gave a go at fixing this - python is definitely not my forte so I could use some feedback. I also have never used Anki or your plugin in its previous working state so it was a fair bit of guess work and not sure how migration would go. Not sure if this is something you still have on your plate -- My solution mostly works (when I hardcode a few things it works great on my account) but I still need to figure out the proper user to pass login API I took from your gitlab and I am looking at how to provide/set the learnLanguage. I also added the audio we get from the new vocab API to the Hoping to get these things polished up in a few days if anything for my personal use but happy to open a PR and get feedback too if it's helpful. |
Hey @tomtaylz , thanks for this and your work on moving this issue forward. There's a couple of challenges here that I'm working with:
Gimme one more day to try to work this out, and then I might accept your help. :) |
OK folks, I've created a release candidate which I can use to pull my words from Duolingo. Check out https://github.com/JASchilz/AnkiSyncDuolingo/releases/tag/3.0.0rc1 and the installation instructions there. Perhaps the biggest disruptive change, in response to Duolingo's own changes to their service in March, is that the plugin is unable to de-duplicate any cards pulled using a previous version of this plugin. I figure that if this is a hurdle to jump, maybe users can share any strategies here for how to address this. Also note that I'm unable to retrieve any new words until after you've completed a lesson. In other words, the plugin won't retrieve words after the first time that you encounter them, it will only retrieve words from lessons that you've completed. I'm also creating an issue to help set expectations about my ability to maintain this plugin and invite any new maintainer who would like to take it on: https://github.com/JASchilz/AnkiSyncDuolingo/issues . |
Thanks @JASchilz I'll cherry pick some of your work to get my local branch in a better state too. I'm happy to help at least in the short term but full disclosure around EOY there's a likelihood I'll drop off more too (aggressively chasing a language right now to prepare for a trip) - I could create a PR for some cleanup and for my audio additions for feedback. |
@tomtaylz seeing your changes would be appreciated. I'd consider posting them more for posterity and not for hopes of getting them merged by me. In #78 I call out that I'm not able to give this project the full attention that I'd like to as owner, and invite a new maintainer. Having your code available could be interesting for that new maintainer, but I'm unlikely to be able to give it the diligence of a full review. |
Do I need to be added for write permissions? I'm on a new computer, so I just want to verify it's not something wrong with my setup before I go down that rabbit hole, but I hit a 403 :) |
@tomtaylz oh, how did you encounter the 403? I don't quite understand the scenario yet. :) If you were asking about contributing your branch to the repository, I was thinking more along the lines of if you've got a fork on GitHub of this project you can link to it here so that it's available for other people to see in the future. :) |
ah, I was just trying to push a branch and submit a PR for feedback, but happy to fork and take that approach also. |
Here's the addition of the audio player - #80 |
Hi BNaturelle, have you tried the release candidate, linked at #76 (comment)? :) Hope it gets this working again for you. :)
…On Tue, Jul 9, 2024, 2:46 PM BNaturelle ***@***.***> wrote:
I'm pretty new to API but I really loved this library before the API
change and want to help however I can
https://www.duolingo.com/2017-06-30/users/{user_id}/courses/{short_language}/en/practice-lexemes
works similarly to the POST "learned-lexemes" from the new get_vocabulary
method. If you send in the skillId's one at a time, you can get the
vocabulary of that specific unit. It also returns lexemeId's like the old
API but only for the first 5 lexemes for some reason. They can also be
grouped together with unit tags in the Anki deck since they all come from
the same skill. It may be possible to get the lexemeId's with different
arguments/json payloads, but I haven't found the golden ticket yet.
image.png (view on web)
<https://github.com/JASchilz/AnkiSyncDuolingo/assets/175161541/3dc87a73-9603-4f7b-88d4-85ed320128de>
https://www.duolingo.com/2017-06-30/words-list/supported-courses
returns a JSON of Duolingo courses offered in each language. Not useful on
its own, but there may be an accessible node hidden somewhere.
I hope someone smarter than me finds this information useful and helpful
in restoring this library
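A minimal sketch of the "one skillId at a time" idea described above. It only builds the requests and sends nothing. The URL template is the one quoted in this comment; the function name `build_practice_requests`, the `progressedSkills` wrapper key, and the shape of each payload entry are assumptions for illustration, not a confirmed description of Duolingo's API.

```python
# URL template as quoted in the thread.
PRACTICE_LEXEMES_URL = (
    "https://www.duolingo.com/2017-06-30/users/{user_id}"
    "/courses/{short_language}/en/practice-lexemes"
)


def build_practice_requests(user_id, short_language, skill_ids):
    """Return one (url, payload) pair per skill ID (hypothetical helper)."""
    url = PRACTICE_LEXEMES_URL.format(
        user_id=user_id, short_language=short_language
    )
    prepared = []
    for skill_id in skill_ids:
        # ASSUMPTION: the wrapper key and the entry fields are guesses at the
        # payload shape; adjust them to whatever the endpoint really accepts.
        payload = {
            "progressedSkills": [
                {
                    "finishedLevels": 1,
                    "finishedSessions": 1,
                    "skillId": {"id": skill_id},
                }
            ]
        }
        prepared.append((url, payload))
    return prepared


prepared_requests = build_practice_requests(123, "es", ["skill-a", "skill-b"])
```

Each pair could then be sent with whatever authenticated session the plugin already holds, and the results grouped under a unit tag for that skill.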
|
@JASchilz Thanks a ton for the RC. This plugin is pretty much my only motivation to continue using Duolingo. Hopefully somebody will volunteer to continue maintaining it. |
3.0.0rc1 works fine, but I'm working to address some of the four "breaking changes" that you listed.
The plugin has no means to avoid pulling words that you already pulled using a previous version of the addon
I have no idea about the gender or pronunciation fields. We can hope that they'll be accessible from the same endpoint as the lexemeIds like before. Maybe that's just wishful thinking.
This will only pull learned words from completed lessons.
current_courses = self._get_data_by_user_id()["currentCourse"]["pathSectioned"]
progressed_skills_Ids = []
progressed_skills = []
for section in current_courses:
    completedUnits = section["completedUnits"]
    units = section["units"]
    for unitIndex in range(len(units)):
        unit = units[unitIndex]
        # Stop after the first unit that is still in progress.
        if unitIndex > completedUnits:
            break
        levels = unit["levels"]
        for level in levels:
            # Only lesson ("skill") levels introduce new skills.
            if level['type'] != 'skill':
                continue
            pathLevelClientData = level["pathLevelClientData"]
            if "skillId" in pathLevelClientData:
                levelSkill = [pathLevelClientData['skillId']]
            elif "skillIds" in pathLevelClientData:
                levelSkill = pathLevelClientData["skillIds"]
            else:
                levelSkill = []
            for levelSkillId in levelSkill:
                # Record each skill only once.
                if levelSkillId not in progressed_skills_Ids:
                    progressed_skills_Ids.append(levelSkillId)
                    if unitIndex < completedUnits:
                        finishedLevels = 1
                        finishedSessions = 1234
                    else:
                        finishedLevels = 0
                        finishedSessions = level["finishedSessions"]
                    new_obj = {
                        "finishedLevels": finishedLevels,
                        "finishedSessions": finishedSessions,
                        "skillId": {
                            "id": levelSkillId
                        }
                    }
                    progressed_skills.append(new_obj)
This will create a less bloated list of all progressed skills from partially and fully completed units. |
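For anyone who wants to try the logic above outside the plugin, here is a standalone sketch of the same traversal that can be run against hand-made data. The function name `collect_progressed_skills` and the sample `sample_path` structure are made up for illustration; only the field names (`completedUnits`, `units`, `levels`, `pathLevelClientData`, and so on) come from the snippet above, and the nesting of the de-duplication check is my reading of it.

```python
def collect_progressed_skills(path_sectioned):
    """Build one entry per skill from completed and in-progress units."""
    seen_ids = set()
    progressed = []
    for section in path_sectioned:
        completed_units = section["completedUnits"]
        for unit_index, unit in enumerate(section["units"]):
            # Units beyond the first in-progress one have not been started.
            if unit_index > completed_units:
                break
            fully_done = unit_index < completed_units
            for level in unit["levels"]:
                # Skip practice-type levels; only lessons introduce skills.
                if level["type"] != "skill":
                    continue
                client_data = level["pathLevelClientData"]
                if "skillId" in client_data:
                    skill_ids = [client_data["skillId"]]
                else:
                    skill_ids = client_data.get("skillIds", [])
                for skill_id in skill_ids:
                    if skill_id in seen_ids:
                        continue
                    seen_ids.add(skill_id)
                    progressed.append({
                        "finishedLevels": 1 if fully_done else 0,
                        # 1234 is the placeholder used in the snippet above.
                        "finishedSessions": (
                            1234 if fully_done else level["finishedSessions"]
                        ),
                        "skillId": {"id": skill_id},
                    })
    return progressed


# Hand-made sample: one finished unit, one in-progress unit, one untouched.
sample_path = [{
    "completedUnits": 1,
    "units": [
        {"levels": [
            {"type": "skill", "finishedSessions": 4,
             "pathLevelClientData": {"skillId": "s1"}},
            {"type": "practice", "finishedSessions": 2,
             "pathLevelClientData": {"skillIds": ["s1"]}},
        ]},
        {"levels": [
            {"type": "skill", "finishedSessions": 1,
             "pathLevelClientData": {"skillIds": ["s1", "s2"]}},
        ]},
        {"levels": [
            {"type": "skill", "finishedSessions": 0,
             "pathLevelClientData": {"skillId": "s3"}},
        ]},
    ],
}]

progressed = collect_progressed_skills(sample_path)
```

With this sample, `s1` comes from the finished unit, `s2` from the in-progress unit, and `s3` is never reached.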
Got it, @BNaturelle, your help is appreciated! Am I right in understanding that the flow would be:
Regarding de-duplicating, I suspect that some other kind of solution might be necessary. For example, is there someone that has a plugin that allows you to merge cards somehow based on a field match? If so, it might be possible to pull down the new cards and merge the new GID values onto your old cards. Then the new version of the plugin would be able to de-duplicate them. |
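A rough sketch of the merge-by-field idea, using plain dictionaries instead of real Anki notes. The field names `Target` and `Gid` and the helper `merge_gids` are hypothetical; a real implementation would have to read and write notes through Anki itself. Words that appear more than once in the new pull are left unmatched, since a field match alone cannot tell them apart.

```python
def merge_gids(old_notes, new_notes, match_field="Target", gid_field="Gid"):
    """Copy new GID values onto old notes whose match field is identical.

    Returns (updated_notes, unmatched_old_notes). Inputs are not modified.
    """
    new_by_value = {}
    ambiguous = set()
    for note in new_notes:
        key = note[match_field].strip().lower()
        if key in new_by_value:
            # Same text pulled twice: cannot pick a GID safely.
            ambiguous.add(key)
        new_by_value[key] = note

    updated, unmatched = [], []
    for note in old_notes:
        key = note[match_field].strip().lower()
        if key in new_by_value and key not in ambiguous:
            merged = dict(note)
            merged[gid_field] = new_by_value[key][gid_field]
            updated.append(merged)
        else:
            unmatched.append(note)
    return updated, unmatched


old_notes = [
    {"Target": "Hola", "Gid": "old-1"},
    {"Target": "banco", "Gid": "old-2"},
    {"Target": "gato", "Gid": "old-3"},
]
new_notes = [
    {"Target": "hola", "Gid": "hola-es"},
    {"Target": "banco", "Gid": "banco-es"},
    {"Target": "banco", "Gid": "banco-es-2"},
]
updated, unmatched = merge_gids(old_notes, new_notes)
```

Here only the "Hola" card gets a new GID; "banco" is ambiguous and "gato" has no counterpart, so both would need manual attention.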
You're partially right. The code snippet does indeed generate a list of partially and fully completed lexemes, and is intended to be used in the payload of a POST request to the learned-lexemes endpoint. It can also be used against the practice-lexemes endpoint, but it returns a vocab list that is much shorter and less useful than the previous endpoint. The get_vocabulary method written by gigajuwels works fine, but it crawls lesson-type levels as well as practice-type levels (even though lessons are the only ones that introduce new skills), leading to repeated, unnecessary entries in the list. It also only crawls levels in completed units, ignoring completed levels in partially complete units. The snippet provided is a small improvement as it crawls only lesson-type levels, and includes levels in partially completed units. Sorry if I'm suggesting changes here in an inconvenient way. I'm new to github/collaborative coding in general and don't know how to make forks or suggested changes yet. But, the snippet is specifically to replace lines 387-422 of duolingo.py in version 3.0.0rc1. |
Got it. I did give this a shot and was able to get in-progress words from
it.
To help set expectations I'm working long hours at my day job right now. So
it will take me at least till next weekend to integrate this. But I do feel
"spun back up" on this project so it shouldn't take much extra time.
Your code looked and worked great. If you were to figure out how to do a
merge request, I'd accept it.
And in either case I can work in the part about hitting both endpoints.
|
thx for your services |
@BNaturelle I believe I've integrated your code into the latest release candidate: https://github.com/JASchilz/AnkiSyncDuolingo/releases/tag/3.0.0rc2 . However, once I started developing on this I found that the same practice words were showing up on both the practice and learned endpoints. I didn't expect that and I'm not sure what the cause is, but I added some de-duplication code so that at least it wouldn't result in multiple notes for the same word. You can take a look at the latest version of that code in this branch: https://github.com/JASchilz/AnkiSyncDuolingo/blob/fix-from-new-library/duolingo_sync/duolingo/duolingo.py#L375-L499 |
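To illustrate the kind of de-duplication described here, a minimal sketch that combines results from two endpoints so each word yields a single entry. This is not the code in the linked branch. The `text` key and the helper name `dedupe_lexemes` are assumptions about how each returned word is represented.

```python
def dedupe_lexemes(*sources, key_field="text"):
    """Merge lexeme lists, keeping one entry per case-insensitive word.

    Earlier sources win when both define a field; later sources only
    contribute fields the earlier entry lacks.
    """
    merged = {}
    for source in sources:
        for lexeme in source:
            key = lexeme[key_field].casefold()
            # Later keys fill gaps, earlier values take precedence.
            merged[key] = {**lexeme, **merged.get(key, {})}
    # Dicts preserve insertion order, so first-seen order is kept.
    return list(merged.values())


learned = [{"text": "hola"}, {"text": "gato"}]
practice = [{"text": "Gato", "lexemeId": "abc"}, {"text": "perro"}]
combined = dedupe_lexemes(learned, practice)
```

Matching on the text alone treats homographs as one word, which is the same weakness as the "{word}-{lang}" identifiers.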
This is expected. The practice-lexemes endpoint has several challenges that I haven't addressed yet.
1. It only returns a maximum of 29 lexemes and does not seem to support pagination or longer outputs (as learned-lexemes does with ?startIndex= or ?limit=). Although it takes the same payload, it does not seem to take the new arguments. In fact, the POST to practice-lexemes on Duolingo.com uses no arguments at all.
2. It seems to return a random list of learned words from all completed lessons. A more useful list can be produced by parsing the progressed_skills list and running the request once for each skill/lesson, then concatenating the outputs. However, if there are more than 29 new words in that skill/lesson, it will return an incomplete list of words. This approach is also more than 7 times slower than learned-lexemes, which returns a maximum of 200 words (as opposed to 29).
The practice-lexemes endpoint is less useful in almost every way. Its only strength (and the reason I brought it up in the first place) is that it is the only reliable way of generating a word bank with lexemeIds like the old API did. The lexemeId worked best as a unique identifier, and didn't have issues with homographs like the new "{word}-{lang}" identifiers do. Getting a full list of lexemeIds seems like a pipe dream at this point, mainly because of the unintuitive and undocumented nature of Duo's new API. To make matters worse, in new lessons (https://www.duolingo.com/2017-06-30/sessions) the lexemes are now called "kc"s with "kc_Ids", and lexemeIds are labelled "legacy_id"s of type "lex". I don't even know what kc means or why it's replacing lexemes, but I'm hoping that it means they're transforming their word bank's back end in some meaningful way. I'll probably check back every month or two to see if Duo has implemented anything new that can be scraped, but otherwise I don't see how to improve from here. I haven't been able to scrape gender, pronunciation, etc. at all. Adding lesson tags is very doable, but maybe not something you want to bother implementing. I'd be willing to give it a try if that's something that interests you. |
Thanks for that summary and your spelunking of this API. I'll give
skill-by-skill iteration (and skill tagging) a shot.
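A sketch of the two retrieval strategies discussed here, written against injected fetch functions so it runs without touching Duolingo. `fetch_page` and `fetch_skill` are hypothetical callables standing in for the learned-lexemes and practice-lexemes requests; the limits of 200 and 29 are the ones reported in this thread, not documented values.

```python
def fetch_all_learned(fetch_page, limit=200):
    """Page through results using a start index until a short page arrives."""
    words = []
    start_index = 0
    while True:
        page = fetch_page(start_index, limit)
        words.extend(page)
        if len(page) < limit:
            return words
        start_index += limit


def collect_by_skill(fetch_skill, skill_ids, cap=29):
    """Fetch one skill at a time and tag each word with its skill ID.

    Skills that return `cap` or more words are reported separately,
    because their lists may have been cut off.
    """
    tagged, maybe_truncated = [], []
    for skill_id in skill_ids:
        lexemes = fetch_skill(skill_id)
        if len(lexemes) >= cap:
            maybe_truncated.append(skill_id)
        for lexeme in lexemes:
            tagged.append({**lexeme, "tags": [skill_id]})
    return tagged, maybe_truncated


# Fake data standing in for real responses.
FAKE_WORDS = [{"text": f"w{i}"} for i in range(450)]
FAKE_SKILLS = {
    "s1": [{"text": "hola"}],
    "s2": [{"text": f"x{i}"} for i in range(29)],
}

all_words = fetch_all_learned(lambda start, limit: FAKE_WORDS[start:start + limit])
tagged, maybe_truncated = collect_by_skill(FAKE_SKILLS.__getitem__, ["s1", "s2"])
```

Flagging skills that hit the cap at least makes the incomplete lists visible, even though it does not recover the missing words.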
|
Glad I could help. The unit/skill names are listed in many different subfolders, but I've found the following to be easiest to crawl: |
A week after the last time I used Pull from Duolingo, I've finished another unit and it's time to pull the latest new cards.
I successfully logged in, then it started pulling vocabulary. However, I got an error message that says "Expected value: line 1 column 1 (char 0)".
I've tried a couple times, getting the same result. I also thought that maybe this was due to a lack of cards to download, so I deleted some cards I suspended. Even after the cards were deleted, I still got the same error. (Don't worry about my studying; these were duplicate words that Duolingo listed twice.)
Has Duolingo screwed around with their system again? Or am I just having a bad day?
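For context on that message: it closely resembles the text Python's `json` module produces ("Expecting value: line 1 column 1 (char 0)") when asked to parse a body that is empty or is not JSON at all, such as an HTML error page. Whether that is what happened inside the plugin here is an assumption. A small sketch of a wrapper that makes such failures easier to diagnose (`parse_json_body` is a hypothetical helper, not part of the plugin):

```python
import json


def parse_json_body(body, status_code):
    """Parse a response body, adding context when it is not valid JSON."""
    try:
        return json.loads(body)
    except json.JSONDecodeError as error:
        preview = body[:80]
        raise ValueError(
            f"Response was not JSON (HTTP {status_code}): {error}; "
            f"body starts with {preview!r}"
        ) from error


# An empty body and an HTML page both fail at the very first character.
messages = []
for body in ("", "<html>Not Found</html>"):
    try:
        parse_json_body(body, 404)
    except ValueError as error:
        messages.append(str(error))
```

Seeing the status code and the first characters of the body usually shows whether the service returned an error page, an empty response, or a login redirect.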