DOC: io.rst description and code inconsistent, plus the description is for deprecated behaviour #60705

wjandrea · 2025-01-12T20:22:00Z

Pandas version checks

I have checked that the issue still exists on the latest versions of the docs on main

Location of the documentation

https://pandas.pydata.org/docs/dev/user_guide/io.html#reading-html-content

Read in the content of the file from the above URL and pass it to read_html as a string:

In [317]: html_str = """
   .....:          <table>
   .....:              <tr>
   .....:                  <th>A</th>
   .....:                  <th colspan="1">B</th>
   .....:                  <th rowspan="1">C</th>
   .....:              </tr>
   .....:              <tr>
   .....:                  <td>a</td>
   .....:                  <td>b</td>
   .....:                  <td>c</td>
   .....:              </tr>
   .....:          </table>
   .....:      """
   .....: 

In [318]: with open("tmp.html", "w") as f:
   .....:     f.write(html_str)
   .....: 

In [319]: df = pd.read_html("tmp.html")

In [320]: df[0]
Out[320]: 
   A  B  C
0  a  b  c

Documentation problems

Problem 1

The "above URL" is

url = 'https://www.sump.org/notes/request/' # HTTP request reflector

but data from that URL is not what's used in the code.

Problem 2

"pass it to read_html as a string" is not what's being demonstrated in the code.

Problem 3

read_html can take an HTML string, but that behaviour is deprecated, per its docs:

Deprecated since version 2.1.0: Passing html literal strings is deprecated. Wrap literal string/bytes input in io.StringIO/io.BytesIO instead.
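
For reference, a minimal sketch of the non-deprecated form that the deprecation note describes. It reuses the HTML table from the docs example and wraps the literal string in io.StringIO before passing it to read_html. It assumes an HTML parser backend that pandas supports (such as lxml) is installed. The variable names are illustrative and are not proposed wording for io.rst.

```python
from io import StringIO

import pandas as pd

# Same table as in the docs example quoted above.
html_str = """
<table>
    <tr>
        <th>A</th>
        <th colspan="1">B</th>
        <th rowspan="1">C</th>
    </tr>
    <tr>
        <td>a</td>
        <td>b</td>
        <td>c</td>
    </tr>
</table>
"""

# Wrap the literal HTML in StringIO rather than passing the raw string,
# which is the form deprecated since pandas 2.1.0.
dfs = pd.read_html(StringIO(html_str))

# read_html returns a list of DataFrames, one per table found.
df = dfs[0]
print(df)
```

Passing a file path, as the current docs code does with "tmp.html", is a separate input form and is not affected by the deprecation.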
