I noticed this while using the Pismo-powered ‘entry text extraction’ on Feedbin.
>> Pismo['http://hsivonen.iki.fi/accept-charset/'].lede
=> "Accept-Charset Is No More. Now that Firefox 10 has been released, the Accept-Charset HTTP header. During the Firefox 4 development cycle, I noticed that IE and Safari were not sending the Accept-Charset HTTP header in their HTTP requests. This meant that the Web had to work even without browser sending that header."
The first sentence given by Pismo:
Now that Firefox 10 has been released, the Accept-Charset HTTP header.
It comes from the following HTML:
<p>Now that Firefox 10 has been released, <del>none of the major browsers send</del><ins>only Chrome sends</ins> the <code>Accept-Charset</code> HTTP header.</p>
If anything, I would have expected Pismo to drop the DEL elements but keep the content of the INS elements, like so:
Now that Firefox 10 has been released, only Chrome sends the Accept-Charset HTTP header.
Even html_body does not return these tags. This means possibly important parts of a document can go missing. See this return value, edited to show only the first paragraph:
>> Pismo['http://hsivonen.iki.fi/accept-charset/'].html_body
=> "Accept-Charset Is No More<p>Now that Firefox 10 has been released, the <code>Accept-Charset</code> HTTP header.</p>\n\n"
Please support the DEL and INS elements by:
- Dropping DEL and its content from lede and body, but keeping the content of INS in both.
- Keeping the DEL and INS elements and their content in html_body.
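
To make the first request concrete, here is a minimal sketch of the text handling I have in mind. It is not Pismo's code: `apply_edit_markup` and `strip_tags` are hypothetical helper names, and the regular expressions are only an illustration that assumes simple, non-nested DEL and INS elements. A real fix would presumably work on the parsed document instead.

```ruby
# Hypothetical helpers for illustration only; not part of Pismo's API.

# Drop <del> elements together with their content, and unwrap <ins>
# elements so their content stays in place.
# Assumes DEL/INS are not nested inside themselves.
def apply_edit_markup(html)
  html
    .gsub(%r{<del\b[^>]*>.*?</del>}mi, '') # remove deletions and their text
    .gsub(%r{</?ins\b[^>]*>}i, '')         # remove only the INS tags
end

# Naive tag stripper, standing in for plain-text extraction.
def strip_tags(html)
  html.gsub(/<[^>]+>/, '')
end

html = '<p>Now that Firefox 10 has been released, ' \
       '<del>none of the major browsers send</del>' \
       '<ins>only Chrome sends</ins> the <code>Accept-Charset</code> ' \
       'HTTP header.</p>'

puts strip_tags(apply_edit_markup(html))
# prints: Now that Firefox 10 has been released, only Chrome sends the Accept-Charset HTTP header.
```

This covers only the lede and body case. For html_body, the second request is simply that neither element be touched.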