No longer downloading anything, only empty folder structure #587
Comments
@EugeneLoy this might be connected to your earlier research and contributions in #559. I'm also getting empty folders on two courses that I started, which is quite strange. Coronavirus maybe? |
Here's mine:
Basically zero files are downloaded @EugeneLoy help us :-) |
Experiencing the same issue, and https://lagunita.stanford.edu/ doesn't work either |
Yes, Lagunita is not working. A fix would be valuable right now, since Stanford will close access to all its courses by the end of March. |
Edx-dl stopped working (it downloads empty folders now) because of a major update being pushed to the edx servers a week ago. Please fix this valuable tool, thank you! |
have the same problem, please help |
Unfortunately having the same issue. |
Having the same issue |
Not working after logging in to my courses. It does not download anything. |
Just for the record, I don't plan on working on edx-dl in the foreseeable
future.
New maintainers are more than welcome to jump in, if other members of the
project agree with this.
Regards,
Rogério Brito.
On Mon, Feb 10, 2020 at 3:59 PM Crystyx wrote:
February 2020: edx-dl -> DEAD.
Let's hope it's revived soon.
|
I wanted to do this too and managed to fix it for Stanford at least.
It's just a small modification to the code that selects the section elements, so that they are extracted properly. I've only tested this on Stanford, where it works. |
Tested on the Amazon AWS DynamoDB course; unfortunately it is still not working. Sad, such a nice tool. Thanks, @rbrito, for the time and effort. I hope someone can keep it going. |
I was having this issue with edX (courses.edx.org). The following change seems to have fixed it, but I don't have time to make a PR presently:
--- a/edx_dl/parsing.py
+++ b/edx_dl/parsing.py
@@ -382,13 +382,13 @@ class NewEdXPageExtractor(CurrentEdXPageExtractor):
         def _make_subsections(section_soup):
             try:
-                subsections_soup = section_soup.select("li.vertical.outline-item.focusable")
+                subsections_soup = section_soup.select("li.subsection")
             except AttributeError:
                 return []
             # FIXME correct extraction of subsection.name (unicode)
             subsections = [SubSection(position=i,
                                       url=s.a['href'],
-                                      name=s.a.div.div.string.strip())
+                                      name=s.a.h4.string.strip())
                            for i, s in enumerate(subsections_soup, 1)]
|
Just to confirm, this worked for me as well, thank you @bi1yeu. |
It doesn't work for me, I still get empty folder structure from this course https://courses.edx.org/courses/course-v1:NYUx+FCS.OS.1+1T2020/course/ |
Was a bit too fast. That change will download the youtube videos for me, but all PDFs are empty/corrupt. |
Interesting, PDFs for my course properly downloaded. If you're comfortable with python I would suggest dropping a breakpoint in that function and seeing whether the PDF hrefs match what you see in the browser. I am guessing the subsection DOM tags for your course don't exactly match my change in all cases.
|
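The guess above, that the subsection markup may differ from course to course, could be checked with a small helper that tries several CSS selectors in turn. This is only a sketch: `select_first_match` and `CANDIDATE_SELECTORS` are hypothetical names that are not part of edx-dl, and the two selectors are simply the ones that appear in the diff earlier in this thread. `select` is assumed to be any callable that takes a selector string and returns the matching elements, such as the `section_soup.select` call from that diff.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple

# The two selectors quoted in the diff in this thread, newest first.
CANDIDATE_SELECTORS: Tuple[str, ...] = (
    "li.subsection",
    "li.vertical.outline-item.focusable",
)


def select_first_match(
    select: Callable[[str], Sequence[Any]],
    selectors: Sequence[str] = CANDIDATE_SELECTORS,
) -> Tuple[Optional[str], List[Any]]:
    """Try each selector in order; return the first one that matches anything."""
    for selector in selectors:
        matches = list(select(selector))
        if matches:
            return selector, matches
    # No selector matched: the page markup probably differs from both variants.
    return None, []
```

From a breakpoint inside `_make_subsections`, calling `select_first_match(section_soup.select)` would show which variant, if any, a given course page uses. A result of `(None, [])` suggests the course needs yet another selector.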
I've used bi1yeu's change to download courses. I get the videos, subtitles and PDFs; however, for the course https://courses.edx.org/courses/course-v1:UTAustinX+UT.9.01x+1T2019/course/ 2 out of 10 sections are empty ("06-Week_2_Extra_Puzzles" and "08-Week_4-_Dodgeball"). Thanks for the change! |
Hello, |
Search for the parsing.py file on your hard drive; it should be in the edx-dl package folder (mine is in D:\ProgramData\Anaconda3\Lib\site-packages\edx_dl). Credit goes to bi1yeu. |
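Instead of searching the disk by hand, Python itself can report where the installed package lives. The following is a minimal sketch, assuming the package is importable as `edx_dl` (the folder name shown in the path above and in the diff); `find_package_file` is a hypothetical helper name, not something edx-dl provides.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def find_package_file(package: str = "edx_dl", filename: str = "parsing.py") -> Optional[Path]:
    """Return the path of `filename` inside an installed package, or None."""
    spec = importlib.util.find_spec(package)
    # find_spec returns None when the package is not installed for this interpreter.
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        candidate = Path(location) / filename
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    path = find_package_file()
    print(path if path else "edx_dl does not appear to be installed for this interpreter")
```

Run it with the same interpreter that runs edx-dl (for example the Anaconda one), since each Python installation has its own site-packages folder.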
You can also download edX courses with this program: |
What course are you trying to download? I can check it. |
OK guys, calm down. I am a co-maintainer of this project and I don't want to let the project die, but I will need your help since I have not been around for a while. Can you please point me to the exact PR numbers that fix the current issues? If fixes are in the form of comments, can you please take them and open PRs with them so I can review/merge them (or point me to the exact code)? |
@iemejia Hey Ismaël -- I left a comment with a small change to the parser that seemed to fix the issue for my course. I am hesitant to make a PR for it since, based on subsequent comments from others, I don't believe it fixes the current issue 100%. |
@iemejia I've sent you an email with credentials, please check. |
I am using the modified file: https://github.com/Crystyx/parsing.py but I still get empty folders when trying to download. |
Acknowledged @balta2ar info received. I will take a look at the issues during the weekend. |
I have tried:
edx-dl -u [email protected] https://courses.edx.org/courses/course-v1:USMx+BUMM612+1T2020/course/ --prefer-cdn-videos -s --file-formats rar,zip,docx,doc,xls,xlsx,ppt,pptx,ods,odt,pdf,e?ps,txt,odp,odg,gz,xz,html,7z
The course gets downloaded until 04-Module_3, when a youtube-dl-related error stops the download. Without the option --prefer-cdn-videos I get the same error. |
Would you provide a cloud download for me? :) |
@danielx11 I tried downloading your course with @bi1yeu's fix and everything works. |
Guys, I have not yet had time to tackle this, but it's great to see action going on. PRs are welcome! |
I appreciate your help. I have already tried that... same issue :( |
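You should change into the source code folder, then run this: python3 edx-dl.py -u your-account course-url. When you run "edx-dl -u [USER] [COURSE URL]", you are just running the wrong binary version. |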
YES! Thanks a lot. That was the actual problem. Now it works :) Just to be clear: I only get the embedded content of the course and not the text on the pages – is that correct? So I will need to copy and paste the text additionally? |
I don't know. When I download other courses, I only get videos, source code zips, PDF files and subtitles. I think downloading the text on the pages is impossible. |
working now for me with python edx-dl.py
hm... I don't get it 😂
Do I have to use another command? I use this:
edx-dl -u [USER] [COURSE URL]
You should change into the source code folder, then run this:
python3 edx-dl.py -u your-account course-url
When you run "edx-dl -u [USER] [COURSE URL]", you are just running the wrong binary version.
YES! Thanks a lot. That was the actual problem. Now it works :)
However, I only get two PDFs and several folders called overview with just videos. That is different from the empty folders, which were structured and named correctly. But with this file: https://github.com/Crystyx/parsing.py AND the correct way to run the app it works fine.
Just to be clear: I only get the embedded content of the course and not the text on the pages, is that correct? So I will need to copy and paste the text additionally?
I don't know. When I download other courses, I only get videos, source code zips, PDF files and subtitles. I think downloading the text on the pages is impossible.
|
Perfect! It seems to work.
I'm on a very slow connection, so I'm still downloading the MP4 files.
Type: python edx-dl.py -u [username] [course url]
It works!!!
|
Sorry, I am new at this. What exactly do you mean when you say "change folder to source code folder"?
On Fri, Feb 21, 2020 at 15:36, Yuri Bochkarev wrote:
Closed #587 via #588.
--
Sincerely, Walter Milen Ruelas Huanca
|
First open cmd (Windows) or a terminal (Linux or Mac). You can download the source code with "git clone https://github.com/coursera-dl/edx-dl.git", which creates a folder named "edx-dl".
|
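Several replies in this thread report that the installed `edx-dl` command and `python edx-dl.py` run from a cloned checkout behaved differently. A quick way to see which of the two is present on a machine might look like the sketch below. `describe_launchers` is a hypothetical helper, not part of edx-dl; it only assumes that the installed command is called `edx-dl` and that the checkout contains `edx-dl.py`, as the commands quoted in the thread suggest.

```python
import shutil
import sys
from pathlib import Path
from typing import Dict, Optional


def describe_launchers(checkout: Path, command: str = "edx-dl") -> Dict[str, Optional[str]]:
    """Report the installed command on PATH and the script in a source checkout."""
    script = checkout / f"{command}.py"
    return {
        # Whatever `edx-dl` resolves to on PATH (the installed release), if any.
        "installed_command": shutil.which(command),
        # The script inside the cloned repository, if this folder contains it.
        "source_script": str(script) if script.is_file() else None,
        # The Python interpreter running this check.
        "interpreter": sys.executable,
    }


if __name__ == "__main__":
    # Run from inside the cloned "edx-dl" folder.
    for key, value in describe_launchers(Path.cwd()).items():
        print(f"{key}: {value}")
```

If `installed_command` points into a site-packages or Scripts directory while `source_script` is the freshly cloned file, the two may well be different versions of the tool.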
Try "python edx-dl.py ... etc etc" if python3 doesn't work.
Mine worked with python.
On Fri, Feb 21, 2020 at 10:09 PM singleDog wrote:
First open cmd (Windows) or a terminal (Linux or Mac). You can download the source code with "git clone https://github.com/coursera-dl/edx-dl.git", which creates a folder named "edx-dl".
Run this:
cd edx-dl
python3 edx-dl.py -u your-account course-url
|
thanks a lot, it worked with python
|
I wasn't able to download even after cloning edx-dl and using the above command. Please tell me where I went wrong: C:\Users\Kishore\edx-dl>python edx-dl.py -u [email protected] -x stanford https://lagunita.stanford.edu/courses/DB/SQL/SelfPaced/course/ |
For this course: https://courses.edx.org/courses/course-v1:W3Cx+HTML5.0x+1T2020/course/
When I tried only with python: I got this error (very similar): |
@Silverfoxcome: I can try to download the course and share it, would this help you? |
It would help a lot ToT! Again, thanks a lot! |
Hello everyone, I am new to Python. Please help me check my results: I don't get any videos, only empty folders. Result: C:\edx-dl-master>python edx-dl.py -u (username) https://courses.edx.org/courses/coursev1:URosarioX+URX01+1T2020/course/ |
Hello everyone, I want to download my course from the edx.org site, but I get an empty folder. Is there a solution to fix this problem? |
Facing same issue. I am using version 0.1.13 via Anaconda on a Windows Machine. |
Hello, I was the maintainer of this project for a long time, but I have less time available now. If anyone wants to bring fixes (or if there are some unreviewed ones), please ping me and I will do my best to review/merge/release them. Remember this is open source and built by all of us, so some help is needed: we need more hands/brains to work on this. |
Hello, I already fixed my problem: I used edx-dl 1.1.0 with Python 3.7.
I am downloading a third project now, but the last one I downloaded is a
corrupted file.
|
🚨Please review the Troubleshooting section
before reporting any issue. Don't forget also to check the current issues to
avoid duplicates.
Subject of the issue
No longer downloads any files, only empty folder structure
Your environment
Steps to reproduce
https://courses.edx.org/courses/course-v1:HKUSTx+ELEC3500.1x+1T2020/course/
Expected behaviour
It should download the entire course
Actual behaviour
Doesn't download any files, only empty folder structure