Using htmlq to filter web data

Similar to jq for JSON, htmlq filters HTML data, using CSS selectors to pick out elements. It reads HTML from standard input by default, so it can be used together with the curl command.

To select the element whose id is article-body:

$ curl -s https://dev.to/anks/using-jq-to-filter-json-data-36c5 | htmlq '#article-body'

To select all code blocks on a given dev.to page, match the elements whose class attribute is exactly "highlight js-code-highlight":

$ curl -s https://dev.to/anks/using-jq-to-filter-json-data-36c5 | htmlq '[class="highlight js-code-highlight"]'

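To keep only the non-code text, select the p elements that are direct children of the article body (the code blocks are not p elements, so they are skipped):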

$ curl -s https://dev.to/anks/using-jq-to-filter-json-data-36c5 | htmlq '#article-body>p'
<p>Basic Elements</p>
<p>n ∉ [0, ∞), int</p>
<p>Ex.</p>
<p>file.json<br>
</p>
<p>To filter ids:<br>
</p>
<p>To return value of <code>name</code> key when id is 1<br>
</p>
<p>To filter ids as json<br>
</p>
<p>Ref. :<br>
<a href="https://stedolan.github.io/jq/">https://stedolan.github.io/jq/</a><br>
<a href="https://programminghistorian.org/en/lessons/json-and-jq">https://programminghistorian.org/en/lessons/json-and-jq</a></p>

To select the same non-code paragraphs and return them as plain text instead of HTML, add the -t option:

$ curl -s https://dev.to/anks/using-jq-to-filter-json-data-36c5 | htmlq -t '#article-body>p'
Basic Elements
n ∉ [0, ∞), int
Ex.
file.json

To filter ids:

To return value of name key when id is 1

To filter ids as json

Ref. :
https://stedolan.github.io/jq/
https://programminghistorian.org/en/lessons/json-and-jq

Ref.
https://github.com/mgdm/htmlq




By stp2y
