No longer downloading anything, only empty folder structure #587

numlockkey · 2020-02-07T18:27:10Z

Subject of the issue

No longer downloads any files, only empty folder structure

Your environment

Operating System (name/version): Win 10
Python version: 3.8.1
youtube-dl version: 2020.01.24
edx-dl version: 0.1.11

Steps to reproduce

https://courses.edx.org/courses/course-v1:HKUSTx+ELEC3500.1x+1T2020/course/

Expected behaviour

It should download the entire course

Actual behaviour

Doesn't download any files, only empty folder structure

arsenyspb · 2020-02-08T01:54:46Z

@EugeneLoy might be connected to your research and contributions in #559 before, I'm also getting empty folders on two courses that I started, quite strange.

Coronavirus maybe?

arsenyspb · 2020-02-08T01:58:32Z

Here's mine:

Darwin <host> 19.3.0 Darwin Kernel Version 19.3.0: Thu Jan  9 20:58:23 PST 2020; root:xnu-6153.81.5~1/RELEASE_X86_64 x86_64
arseny@almond:_MITx$ which python
/Users/<user>/anaconda3/bin/python
<user>@<host>:_MITx$ which edx-dl
/Users/<user>/anaconda3/bin/edx-dl
<user>@<host>:_MITx$ edx-dl --version
0.1.11
<user>@<host>:_MITx$ tree -L 4
.

0 directories, 0 files
<user>@<host>:_MITx$ edx-dl -u <user>@<email> https://courses.edx.org/courses/course-v1:MITx+14.310x+1T2020/course/ --debug
root[main] edx_dl version 0.1.11
root[parse_file_formats] file_formats: ['e?ps', 'pdf', 'txt', 'doc', 'xls', 'ppt', 'docx', 'xlsx', 'pptx', 'odt', 'ods', 'odp', 'odg', 'zip', 'rar', 'gz', 'mp3', 'R', 'Rmd', 'ipynb', 'py']
Password:
root[edx_get_headers] Building initial headers for future requests.
root[_get_initial_token] Getting initial CSRF token.
root[_get_initial_token] Found CSRF token.
root[edx_get_headers] Headers built: {'User-Agent': 'edX-downloader/0.01', 'Accept': 'application/json, text/javascript, */*; q=0.01', 'Content-Type': 'application/x-www-form-urlencoded;charset=utf-8', 'Referer': 'https://courses.edx.org/user_api/v1/account/login_session', 'X-Requested-With': 'XMLHttpRequest', 'X-CSRFToken': 'ZZh0LVFSfMEgW4oh5YtmzXb4c7yIgB777ytXRIceZGbhIqvhKilRpm1ulHBkX8at'}
root[edx_login] Logging into Open edX site: https://courses.edx.org/login_ajax
root[get_courses_info] Extracting course information from dashboard.
root[get_courses_info] Data extracted: 
...
<snip>
....
root[get_available_sections] Extracting sections for :https://courses.edx.org/courses/course-v1:MITx+14.310x+1T2020/course/
root[get_available_sections] Extracted sections: [<edx_dl.common.Section object at 0x109dcee48>, <edx_dl.common.Section object at 0x109dcee10>, <edx_dl.common.Section object at 0x109dce438>, <edx_dl.common.Section object at 0x109dcedd8>]
root[_display_selections] Downloading Data Analysis for Social Scientists [course-v1:MITx+14.310x+1T2020/co]
root[_display_sections] Downloading 4 section(s)
root[_display_sections] Section  1: Module 1: Introduction to the Course
root[_display_sections] Section  2: Entrance Survey
root[_display_sections] Section  3: Module 2: Fundamentals of Probability, Random Variables,  Joint Distributions + Collecting Data
root[_display_sections] Section  4: Module 3:  Describing Data, Joint and Conditional Distributions of Random Variables
root[extract_all_units_in_parallel] Extracting all units information in parallel.
root[extract_all_units_in_parallel] urls: []
root[main] Removed 0 duplicated urls from 0 in total
root[download] Output directory: Downloaded
<user>@<host>:_MITx$ tree -L 4
.
└── Downloaded
    └── Data_Analysis_for_Social_Scientists
        ├── 01-Module_1-_Introduction_to_the_Course
        ├── 02-Entrance_Survey
        ├── 03-Module_2-_Fundamentals_of_Probability_Random_Variables__Joint_Distributions__Collecting_Data
        └── 04-Module_3-__Describing_Data_Joint_and_Conditional_Distributions_of_Random_Variables

6 directories, 0 files
<user>@<host>:_MITx$

Basically zero files are downloaded @EugeneLoy help us :-)
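The symptom in the log above (folders are created, `urls: []`, no files) can also be checked without `tree`. A minimal sketch using only the standard library; the function name `count_tree` is hypothetical, and `Downloaded` is simply the output directory shown in the debug log:

```python
import os


def count_tree(root):
    """Return (directories, files) found below root, not counting root itself."""
    n_dirs = n_files = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        n_dirs += len(dirnames)
        n_files += len(filenames)
    return n_dirs, n_files


if __name__ == "__main__":
    # A missing directory simply yields (0, 0).
    dirs, files = count_tree("Downloaded")
    print(f"{dirs} directories, {files} files")
```

A non-zero directory count together with zero files matches the behaviour reported in this issue.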

MATRIX30 · 2020-02-08T17:17:16Z

Experiencing the same issue, and https://lagunita.stanford.edu/ doesn't work either.

frifich · 2020-02-08T22:39:07Z

yes, lagunita is not working. would be a good fix right now since stanford will close access to all its courses by the end of march

Crystyx · 2020-02-08T23:46:41Z

Edx-dl stopped working (it downloads empty folders now) because of a major update being pushed to the edx servers a week ago. Please fix this valuable tool, thank you!

econwalter23 · 2020-02-09T09:25:19Z

have the same problem, please help

dmagliano · 2020-02-10T13:36:01Z

Unfortunately having the same issue.

MostafaWahdan · 2020-02-10T18:02:20Z

Having the same issue

ichit · 2020-02-10T23:08:11Z

Not working after logging in to my courses. It does not download anything.

rbrito · 2020-02-10T23:29:47Z

Just for the record, I don't plan on working on edx-dl in the foreseeable future. New maintainers are more than welcome to jump in, if other members of the project agree with this. Regards, Rogério Brito.

On Mon, Feb 10, 2020 at 3:59 PM Crystyx wrote: February 2020: edx-dl -> DEAD. Let's hope it's revived soon.

Canas · 2020-02-11T04:06:12Z

yes, lagunita is not working. would be a good fix right now since stanford will close access to all its courses by the end of march

I wanted to do this too and managed to fix it for Stanford at least.

Go to parsing.py
At line 425, make Stanford use the NewEdXPageExtractor() instead of CurrentEdXPageExtractor()
Replace NewEdXPageExtractor with the following implementation

class NewEdXPageExtractor(CurrentEdXPageExtractor):
    """
    A new page extractor for the latest changes in layout of edx
    """

    def extract_sections_from_html(self, page, BASE_URL):
        """
        Extract sections (Section->SubSection) from the html page
        """
        def _make_url(section_soup):  # FIXME: Extract from here and test
            try:
                return None
            except AttributeError:
                # Section might be empty and contain no links
                return None

        def _get_section_name(section_soup):  # FIXME: Extract from here and test
            try:
                return section_soup.div.h3.string.strip()
            except AttributeError:
                return None

        def _make_subsections(section_soup):
            try:
                subsections_soup = section_soup.find_all('li', class_=['subsection'])
            except AttributeError:
                return []
            # FIXME correct extraction of subsection.name (unicode)
            subsections = [SubSection(position=i,
                                      url=s.a['href'],
                                      name=s.a.div.span.string.strip())
                           for i, s in enumerate(subsections_soup, 1)]

            return subsections

        soup = BeautifulSoup(page)
        sections_soup = soup.find_all('li', class_=['outline-item focusable section'])

        sections = [Section(position=i,
                            name=_get_section_name(section_soup),
                            url=_make_url(section_soup),
                            subsections=_make_subsections(section_soup))
                    for i, section_soup in enumerate(sections_soup, 1)]
        # Filter out those sections for which name could not be parsed
        sections = [section for section in sections
                    if section.name]

        return sections

It's just a small modification to some of the code that gets the section elements to get them properly. I've only tested this on Stanford and it works.

dmagliano · 2020-02-11T13:06:11Z

Tested on the Amazon AWS DynamoDB course; unfortunately it is still not working. Sad, such a nice tool. Thanks, @rbrito, for the time and effort. I hope someone can keep it going.

bi1yeu · 2020-02-12T06:11:30Z

I was having this issue with edX (courses.edx.org). The following change seems to have fixed it, but I don't have time to make a PR presently:

--- a/edx_dl/parsing.py
+++ b/edx_dl/parsing.py
@@ -382,13 +382,13 @@ class NewEdXPageExtractor(CurrentEdXPageExtractor):

         def _make_subsections(section_soup):
             try:
-                subsections_soup = section_soup.select("li.vertical.outline-item.focusable")
+                subsections_soup = section_soup.select("li.subsection")
             except AttributeError:
                 return []
             # FIXME correct extraction of subsection.name (unicode)
             subsections = [SubSection(position=i,
                                       url=s.a['href'],
-                                      name=s.a.div.div.string.strip())
+                                      name=s.a.h4.string.strip())
                            for i, s in enumerate(subsections_soup, 1)]
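To see what the changed selector is meant to match, here is a rough sketch that uses only the standard library's `html.parser` instead of BeautifulSoup (which edx-dl itself uses). The HTML sample and the class name `SubsectionParser` are hypothetical, modelled only on the selectors in the diff (`li.subsection` with the title in `a > h4`); the real edX markup may differ, and nested lists are not handled.

```python
from html.parser import HTMLParser


class SubsectionParser(HTMLParser):
    """Collect [href, title] pairs from <li class="... subsection ..."> items."""

    def __init__(self):
        super().__init__()
        self.subsections = []
        self._in_li = False   # True while inside a matching <li>
        self._in_h4 = False   # True while inside an <h4> of a matching <li>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "li" and "subsection" in classes:
            self._in_li = True
            self.subsections.append([None, ""])
        elif self._in_li and tag == "a" and self.subsections[-1][0] is None:
            self.subsections[-1][0] = attrs.get("href")
        elif self._in_li and tag == "h4":
            self._in_h4 = True

    def handle_endtag(self, tag):
        if tag == "h4":
            self._in_h4 = False
        elif tag == "li":
            self._in_li = False

    def handle_data(self, data):
        if self._in_li and self._in_h4:
            self.subsections[-1][1] += data.strip()


# Hypothetical markup: one item in the new layout, one in the old layout.
SAMPLE = """
<ol>
  <li class="subsection accordion">
    <a href="/courses/demo/jump_to/block-1"><h4 class="subsection-title"> Lecture 1 </h4></a>
  </li>
  <li class="vertical outline-item focusable">
    <a href="/courses/demo/jump_to/block-2"><div><div>Old layout</div></div></a>
  </li>
</ol>
"""

parser = SubsectionParser()
parser.feed(SAMPLE)
print(parser.subsections)  # only the "subsection" item is collected
```

If a course page yields an empty list with this kind of check, its subsection markup probably does not match the patched selector, which would explain why the change works for some courses and not others.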

simonbogh · 2020-02-12T10:03:32Z

Just to confirm, this worked for me as well, thank you @bi1yeu.

DarthVi · 2020-02-12T11:02:37Z

@bi1yeu's change doesn't work for me; I still get an empty folder structure for this course: https://courses.edx.org/courses/course-v1:NYUx+FCS.OS.1+1T2020/course/

simonbogh · 2020-02-12T11:47:17Z

Was a bit too fast. That change will download the youtube videos for me, but all PDFs are empty/corrupt.

bi1yeu · 2020-02-12T16:45:08Z

Interesting, PDFs for my course properly downloaded. If you're comfortable with python I would suggest dropping a breakpoint in that function and seeing whether the PDF hrefs match what you see in the browser. I am guessing the subsection DOM tags for your course don't exactly match my change in all cases.
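One way to do that comparison outside the debugger is to save the page HTML and list the links a downloader would consider. This is only a sketch with the standard library: `HrefCollector` and `matching_links` are hypothetical names, and it assumes the file formats are treated as regular-expression fragments, which the `e?ps` entry in the debug output earlier in this thread suggests but does not prove.

```python
import re
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def matching_links(html, file_formats):
    """Return hrefs whose path ends in one of the given extension patterns."""
    collector = HrefCollector()
    collector.feed(html)
    # Assumption: each format is a regex fragment, e.g. 'e?ps' covers .eps and .ps.
    pattern = re.compile(r"\.(?:%s)$" % "|".join(file_formats), re.IGNORECASE)
    # Ignore any query string when checking the extension.
    return [h for h in collector.hrefs if pattern.search(h.split("?", 1)[0])]
```

Calling `matching_links(page_html, ["pdf"])` on a saved unit page and comparing the result with the links visible in the browser shows whether the PDF URLs are being found at all or are found but fail to download.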

Crystyx · 2020-02-12T17:06:58Z

I've used bi1yeu's change to download courses. I get the videos, subtitles and PDFs; however, for the course

https://courses.edx.org/courses/course-v1:UTAustinX+UT.9.01x+1T2019/course/

2 out of 10 sections are empty ("06-Week_2_Extra_Puzzles" and "08-Week_4-_Dodgeball").

Thanks for the change!

ichit · 2020-02-12T22:58:54Z

Hello,
Please can somebody help me with how to use bi1yeu's change to download courses. I am a bit new on how to apply such changes. I desperately need to get some courses from Edx for my thesis.
Thanks

Crystyx · 2020-02-13T00:12:30Z

Search for the parsing.py file on your hard drive; it should be in a folder named edx_dl (mine is in D:\ProgramData\Anaconda3\Lib\site-packages\edx_dl).
Do 1) or 2):
1) Replace your parsing.py file with the one found at https://github.com/Crystyx/parsing.py
2) Open your parsing.py file in a text editor, search for the code bi1yeu is referring to, and replace the lines that appear in red (prefixed with -) with the lines that appear in green (prefixed with +).

Credit goes to bi1yeu.

numlockkey · 2020-02-13T17:48:10Z

You can also download edX courses with this program:
https://www.allavsoft.com/video-downloader-converter.html

Crystyx · 2020-02-13T21:16:27Z

if it was working at some stage, is not working now.

What course are you trying to download? I can check it.

iemejia · 2020-02-13T21:32:13Z

Ok guys, calm down. I am a co-maintainer of this project and I don't want to let the project die, but I will need your help since I have not been around for a while. Can you please point me to the exact PR numbers that fix the current issues? If fixes are in the form of comments, can you please take them and open PRs with them so I can review/merge them (or point me to the exact code)?
Thanks!

iemejia · 2020-02-13T21:36:50Z

@balta2ar or @rbrito can you please give me permissions in pypi so I can do new releases (and eventually send me via email any particular instruction to do the releases). My username in pypi is iemejia. Thx!

bi1yeu · 2020-02-14T04:46:07Z

@iemejia Hey Ismaël -- I left a comment with a small change to the parser that seemed to fix the issue for my course. I am hesitant to make a PR for it since, based on subsequent comments from others, I don't believe it fixes the current issue 100%.

balta2ar · 2020-02-14T06:45:29Z

@iemejia I've sent you an email with credentials, please check.

danielx11 · 2020-02-14T07:56:06Z

I am using the modified file: https://github.com/Crystyx/parsing.py

But I still get empty folders by trying to download
https://courses.edx.org/courses/course-v1:USMx+BUMM612+1T2020/course/

iemejia · 2020-02-14T16:23:52Z

Acknowledged @balta2ar info received. I will take a look at the issues during the weekend.

Crystyx · 2020-02-14T16:59:51Z

I have tried

edx-dl -u [email protected] https://courses.edx.org/courses/course-v1:USMx+BUMM612+1T2020/course/ --prefer-cdn-videos -s --file-formats rar,zip,docx,doc,xls,xlsx,ppt,pptx,ods,odt,pdf,e?ps,txt,odp,odg,gz,xz,html,7z

The course gets downloaded until 04-Module_3, when a youtube-dl-related error stops the download. Without the option --prefer-cdn-videos I get the same error.

danielx11 · 2020-02-21T10:54:08Z

hm, are you checking the right folder? because I enrolled in the course you specified (Marketing Management) and I've managed to download the files.

Would you provide a cloud download for me? :)

hhankj2u · 2020-02-21T10:59:33Z

@danielx11 I tried to download your course with @bi1yeu's fix and everything works.
Try cloning the edx-dl source code and installing it manually.

iemejia · 2020-02-21T11:11:45Z

Guys, I have not yet had time to tackle this, but it's great to see action going on. PRs are welcome!
Hope to get back to this soon.

danielx11 · 2020-02-21T11:17:56Z

I appreciate your help. I have already tried that... same issue :(

fix coursera-dl#587

Oshibuki · 2020-02-21T13:14:08Z

hm... I dont get it 😂

do I have to use another command? I use this:
edx-dl -u [USER] [COURSE URL]

You should change to the source code folder, then run this:
python3 edx-dl.py -u your-account course-url
When you run "edx-dl -u [USER] [COURSE URL]", you are just running the wrong binary version.
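The point here seems to be that the `edx-dl` command found on the PATH runs whatever copy of the package is installed, not a patched checkout. A small sketch to see which copies a given interpreter resolves; the helper name `locate` is hypothetical, while `shutil.which` and `importlib.util.find_spec` are standard-library calls:

```python
import importlib.util
import shutil


def locate(command="edx-dl", module="edx_dl"):
    """Report where the console script and the importable package come from."""
    spec = importlib.util.find_spec(module)
    return {
        "script": shutil.which(command),
        "package": spec.origin if spec else None,
    }


if __name__ == "__main__":
    for name, path in locate().items():
        print(f"{name}: {path or 'not found'}")
```

If `package` points into `site-packages`, edits made to a cloned `parsing.py` are not being used by that interpreter.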

danielx11 · 2020-02-21T13:19:38Z

YES! Thanks a lot. That was the actual problem. Now it works :)
However, I only get two PDFs and several folders called overview with just videos. Different to the empty folders – they were structured and named correctly. But with this file: https://github.com/Crystyx/parsing.py AND the correct way to run the app it works fine.

Just to be clear: I only get the embedded content of the course and not the text on the pages – is that correct? So I will need to copy and paste the text additionally?

Oshibuki · 2020-02-21T14:25:24Z

I don't know. When I download other courses, I only get videos, source code zips, PDF files and subtitles. I think downloading the text on the pages is impossible.

crashoverburn · 2020-02-21T18:39:30Z

working now for me with python edx-dl.py

dmagliano · 2020-02-21T18:57:29Z

Perfect! It seems to work. I'm on a very slow connection, so I'm still downloading MP4s. Type: python edx-dl.py -u [username] [course url] and it works!

fixes #587

Bugfixes: - Fix section layouts (fix #587, #588)

econwalter23 · 2020-02-21T23:39:02Z

Sorry, I am new at this. What exactly do you mean when you say "change folder to source code folder"?

Oshibuki · 2020-02-22T01:09:12Z

First open cmd (Windows) or a terminal (Linux or Mac). You can download the source code with "git clone https://github.com/coursera-dl/edx-dl.git"; this creates a folder named "edx-dl".
Then run this:

cd edx-dl
python3 edx-dl.py -u your-account course-url

dmagliano · 2020-02-22T02:24:26Z

Try "python edx-dl.py ..." if python3 doesn't work. Mine worked with python.

econwalter23 · 2020-02-22T10:13:39Z

Thanks a lot, it worked with python.

KrisCherukuri · 2020-03-07T03:10:19Z

I wasn't able to download even after cloning edx-dl and using the above command. Please tell me where I went wrong.

C:\Users\Kishore\edx-dl>python edx-dl.py -u [email protected] -x stanford https://lagunita.stanford.edu/courses/DB/SQL/SelfPaced/course/
edx_dl version 0.1.13
Password:
Building initial headers for future requests.
Getting initial CSRF token.
Found CSRF token.
Logging into Open edX site: https://lagunita.stanford.edu/login_ajax
Extracting course information from dashboard.
Traceback (most recent call last):
File "edx-dl.py", line 8, in
edx_dl.main()
File "C:\Users\Kishore\edx-dl\edx_dl\edx_dl.py", line 1028, in main
for selected_course in selected_courses}
File "C:\Users\Kishore\edx-dl\edx_dl\edx_dl.py", line 1028, in
for selected_course in selected_courses}
File "C:\Users\Kishore\edx-dl\edx_dl\edx_dl.py", line 186, in get_available_sections
sections = page_extractor.extract_sections_from_html(page, BASE_URL)
File "C:\Users\Kishore\edx-dl\edx_dl\parsing.py", line 403, in extract_sections_from_html
for i, section_soup in enumerate(sections_soup, 1)]
File "C:\Users\Kishore\edx-dl\edx_dl\parsing.py", line 403, in
for i, section_soup in enumerate(sections_soup, 1)]
File "C:\Users\Kishore\edx-dl\edx_dl\parsing.py", line 392, in _make_subsections
for i, s in enumerate(subsections_soup, 1)]
File "C:\Users\Kishore\edx-dl\edx_dl\parsing.py", line 392, in
for i, s in enumerate(subsections_soup, 1)]
AttributeError: 'NoneType' object has no attribute 'string'
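The traceback ends inside the list comprehension in `_make_subsections`, where a chained lookup hits a tag that is not there and `.string` is read from `None`. One possible guard is to skip items whose markup does not match instead of crashing. A minimal sketch, not edx-dl's code: `dig` and `subsection_name` are hypothetical helpers, the `a`/`h4` path is taken from bi1yeu's patch and may differ for the layout in question, and `SimpleNamespace` objects stand in for BeautifulSoup tags.

```python
from types import SimpleNamespace


def dig(node, *names):
    """Follow attribute names from node; return None as soon as a step is missing."""
    for name in names:
        if node is None:
            return None
        node = getattr(node, name, None)
    return node


def subsection_name(item):
    """Return the stripped title of a subsection item, or None if the markup differs."""
    text = dig(item, "a", "h4", "string")
    return text.strip() if text else None


# Stand-ins for parsed tags: one with the expected structure, one without an <h4>.
good = SimpleNamespace(a=SimpleNamespace(h4=SimpleNamespace(string="  Week 1  ")))
odd = SimpleNamespace(a=SimpleNamespace(h4=None))

names = [subsection_name(s) for s in (good, odd)]
print([n for n in names if n])  # the item without a title is skipped
```

Skipping silently hides the mismatch, so logging the skipped items would be the safer variant when debugging a new layout.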

Silverfoxcome · 2020-08-14T17:33:22Z

Hm... I don't get it.
Do I have to use another command? I use this:
edx-dl -u [USER] [COURSE URL]

You should change to the source code folder, then run this:
python3 edx-dl.py -u your-account course-url
When you run "edx-dl -u [USER] [COURSE URL]", you are just running the wrong binary version.

For this course: https://courses.edx.org/courses/course-v1:W3Cx+HTML5.0x+1T2020/course/
On Linux, using the parsing.py from Crystyx, I ran it from the source folder that way, but I get this error:

Traceback (most recent call last):
  File "edx_dl.py", line 33, in <module>
    from ._version import __version__
ModuleNotFoundError: No module named '__main__._version'; '__main__' is not a package

When I tried with plain python:
python edx_dl.py -u [USER] [COURSE URL]

I got this (very similar) error:

Traceback (most recent call last):
  File "edx_dl.py", line 33, in <module>
    from ._version import __version__
ValueError: Attempted relative import in non-package
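
Both tracebacks above come from running edx_dl.py (underscore) directly. Judging by the paths in the earlier Windows traceback in this thread, that file lives inside the edx_dl package, while the suggested command uses the edx-dl.py launcher (hyphen) in the repository root. A file run directly is loaded as the top-level `__main__` module, so its relative import has no parent package to resolve against; the second message is the wording Python 2 used for the same problem. A minimal sketch that reproduces this with a throwaway package (the names `demo_pkg` and `app.py` are hypothetical):

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_both_ways():
    """Create a tiny throwaway package and run its module two ways."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"  # hypothetical package name
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "_version.py").write_text("__version__ = '0.0.0'\n")
        (pkg / "app.py").write_text(
            "from ._version import __version__\n"
            "print(__version__)\n"
        )

        # Run the file directly from inside the package folder: it becomes
        # __main__, and the relative import cannot be resolved.
        as_script = subprocess.run(
            [sys.executable, "app.py"],
            cwd=pkg, capture_output=True, text=True,
        )
        # Run it as a module from the parent folder: the package context
        # is kept, so the relative import works.
        as_module = subprocess.run(
            [sys.executable, "-m", "demo_pkg.app"],
            cwd=tmp, capture_output=True, text=True,
        )
        return as_script, as_module


if __name__ == "__main__":
    as_script, as_module = run_both_ways()
    print("script exit code:", as_script.returncode)
    print("module output:", as_module.stdout.strip())
```

For edx-dl this suggests running `python3 edx-dl.py -u [USER] [COURSE URL]` from the repository root, as described earlier in the thread, rather than invoking the file inside the package.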

Crystyx · 2020-08-14T19:03:56Z

@Silverfoxcome: I can try to download the course and share it, would this help you?

Silverfoxcome · 2020-08-14T20:35:13Z

@Silverfoxcome: I can try to download the course and share it, would this help you?

It would help a lot ToT!
Thank you!
If it works for you, please, can you tell me how did you do it? I'm very curious about this issue and if there is a way to download the videos in edx with the new changes they have put.

Again, thanks a lot!

Learnpython-code · 2020-12-08T05:18:53Z

Hello everyone, I am new to Python. Please help me check my results: I don't get any videos, only empty folders.

Result

C:\edx-dl-master>python edx-dl.py -u (username) https://courses.edx.org/courses/coursev1:URosarioX+URX01+1T2020/course/
edx_dl version 0.1.13
Password:
Building initial headers for future requests.
Getting initial CSRF token.
Found CSRF token.
Logging into Open edX site: https://courses.edx.org/login_ajax
Extracting course information from dashboard.
Downloading Diseño de sistemas de información gerencial para intranet con Microsoft Access [course-v1:URosarioX+URX01+1T2020/co]
Downloading 5 section(s)
Section 1: Generalidades
Acerca del curso
Section 2: Microsoft Access y Bases de Datos Relacionales
Conceptos básicos
Planear y crear una BDR
Evaluación
Section 3: Diseño de la interface - Consultas
Visualizar información
Modificar la BDR con consultas de acción
Interacción con otros programas
Evaluación
Section 4: Diseño de la interface - Formularios y macros
Ingresar datos a la BDR
Panel de control personalizado
Evaluación
Section 5: Diseño de la interface - Informes
Informes
Evaluación
Cierre
Extracting all units information in parallel.
Processing 'https://courses.edx.org/courses/course-v1:URosarioX+URX01+1T2020/jump_to/block-v1:URosarioX+URX01+1T2020+type@sequential+block@ddbbb4394e4f4eeab571695c19842fc2'
Processing 'https://courses.edx.org/courses/course-v1:URosarioX+URX01+1T2020/jump_to/block-v1:URosarioX+URX01+1T2020+type@sequential+block@edcc3663b92546ee9f37d4868d05ba30'
Processing 'https://courses.edx.org/courses/course-v1:URosarioX+URX01+1T2020/jump_to/block-v1:URosarioX+URX01+1T2020+type@sequential+block@7a917180012346c8b7f1de5837729bbd'
Processing 'https://courses.edx.org/courses/course-v1:URosarioX+URX01+1T2020/jump_to/block-v1:URosarioX+URX01+1T2020+type@sequential+block@fdb672aa18b0485aa695419f493a5fd0'
Processing 'https://courses.edx.org/courses/course-v1:URosarioX+URX01+1T2020/jump_to/block-v1:URosarioX+URX01+1T2020+type@sequential+block@5b34eb36e50a4db6a9c4c53e719546cf'
Processing 'https://courses.edx.org/courses/course-v1:URosarioX+URX01+1T2020/jump_to/block-v1:URosarioX+URX01+1T2020+type@sequential+block@c78e301110b54cff8a8500c784e16d09'
Processing 'https://courses.edx.org/courses/course-v1:URosarioX+URX01+1T2020/jump_to/block-v1:URosarioX+URX01+1T2020+type@sequential+block@fcd257068abb4f588805db3a15e0ba06'
Processing 'https://courses.edx.org/courses/course-v1:URosarioX+URX01+1T2020/jump_to/block-v1:URosarioX+URX01+1T2020+type@sequential+block@9205182f4d2b46ec93fd6ff22d752fa6'
Processing 'https://courses.edx.org/courses/course-v1:URosarioX+URX01+1T2020/jump_to/block-v1:URosarioX+URX01+1T2020+type@sequential+block@f9a2c97a613a40169a01667bb6aca2be'
Processing 'https://courses.edx.org/courses/course-v1:URosarioX+URX01+1T2020/jump_to/block-v1:URosarioX+URX01+1T2020+type@sequential+block@30549607116847379bc57b4419084652'
Processing 'https://courses.edx.org/courses/course-v1:URosarioX+URX01+1T2020/jump_to/block-v1:URosarioX+URX01+1T2020+type@sequential+block@09f8ee9e32954914957494d87da8a4bc'
Processing 'https://courses.edx.org/courses/course-v1:URosarioX+URX01+1T2020/jump_to/block-v1:URosarioX+URX01+1T2020+type@sequential+block@674fda5e810440f190d849740e674cae'
Processing 'https://courses.edx.org/courses/course-v1:URosarioX+URX01+1T2020/jump_to/block-v1:URosarioX+URX01+1T2020+type@sequential+block@fe847e5e361b47a3a3efd82f480b2a4e'
Processing 'https://courses.edx.org/courses/course-v1:URosarioX+URX01+1T2020/jump_to/block-v1:URosarioX+URX01+1T2020+type@sequential+block@29c2dfb8e8294eed941ee3b576db59c8'
Removed 0 duplicated urls from 0 in total
Output directory: Downloaded

rangerisrael · 2021-01-11T03:33:54Z

Hello everyone, I want to download my course from the edx.org site, but I only get empty folders. Is there a solution to this problem?

ahsanfarooqui · 2021-01-20T18:36:59Z

Facing the same issue. I am using version 0.1.13 via Anaconda on a Windows machine.

iemejia · 2021-01-21T08:37:36Z

Hello, I was the maintainer of this project for a long time, but I have less time available now. If anyone wants to bring fixes (or if there are some unreviewed ones), please ping me and I will do my best to review/merge/release them. Remember this is open source and built by all of us, so help is needed: we need more hands/brains to work on this.

rangerisrael · 2021-01-21T08:53:44Z

Hello, I already fixed my problem. I used edx-dl 1.1.0 with Python 3.7. I have downloaded 3 projects now, but the last one I downloaded is a corrupted file.



Oshibuki added a commit to Oshibuki/edx-dl that referenced this issue Feb 21, 2020

Update parsing.py

505e605

fix coursera-dl#587

Oshibuki mentioned this issue Feb 21, 2020

fixes coursera-dl/edx-dl#587 #588

Merged

9 tasks

balta2ar closed this as completed in #588 Feb 21, 2020

balta2ar added a commit that referenced this issue Feb 21, 2020

Merge pull request #588 from tanjiarui15/fix-parsing.py-for-edx.org

25cbdee

fixes #587

balta2ar added a commit that referenced this issue Feb 21, 2020

Bump version (0.1.11 -> 0.1.12)

5353d82

Bugfixes: - Fix section layouts (fix #587, #588)

pintu4india mentioned this issue Feb 21, 2020

Using edx-dl 0.1.12 , Still downloading empty folder structure, Issue #587 not yet resolved.. #589

Closed

millsio mentioned this issue Sep 7, 2020

processing but cannot download the videos #639

Open

DGEs2018 mentioned this issue Oct 17, 2020

HTTP 403 forbidden #631

Open
