htmlcleaner - HTML parser written in Java

Property              Value
Distribution          Fedora 26
Repository            Fedora Updates x86_64
Package name          htmlcleaner
Package version       2.2.1
Package release       10.fc26
Package architecture  noarch
Package type          rpm
Installed size        119.52 KB
Download size         116.38 KB
HtmlCleaner is an open-source HTML parser written in Java. HTML found on the
Web is usually dirty, ill-formed, and unsuitable for further processing.
For any serious consumption of such documents, it is necessary to first
clean up the mess and bring order to tags, attributes, and ordinary text.
For a given HTML document, HtmlCleaner reorders individual elements and
produces well-formed XML. By default, it follows rules similar to those that
most web browsers use to create the Document Object Model. However, the user
may provide a custom tag and rule set for tag filtering and balancing.


Alternatives

Package                               Version  Architecture  Repository
htmlcleaner-2.2.1-10.fc26.noarch.rpm  2.2.1    noarch        Fedora Updates
htmlcleaner-2.2.1-6.fc21.noarch.rpm   2.2.1    noarch        Fedora


Requires

Name                Value
java-headless       >= 1.6
javapackages-tools  -
mvn(org.jdom:jdom)  -


Provides

Name                                               Value
htmlcleaner                                        = 2.2.1-10.fc26
mvn(net.sourceforge.htmlcleaner:htmlcleaner)       = 2.2.1
mvn(net.sourceforge.htmlcleaner:htmlcleaner:pom:)  = 2.2.1


Download

Type            File
Binary Package  htmlcleaner-2.2.1-10.fc26.noarch.rpm
Source Package  htmlcleaner-2.2.1-10.fc26.src.rpm
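
The file names listed above follow the usual RPM convention
name-version-release.arch.rpm. As a minimal sketch, the Java snippet below
splits such a file name into its four parts. The class name RpmFileName and
its parse method are hypothetical helpers introduced only for illustration;
they are not part of htmlcleaner or of any RPM tooling. The pattern assumes
that neither the version nor the release contains a hyphen, while the package
name may.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: splits "name-version-release.arch.rpm" into its parts. */
final class RpmFileName {
    // The name may contain hyphens, so version and release are taken as the
    // last two hyphen-separated fields before the ".arch.rpm" suffix.
    private static final Pattern NVRA =
            Pattern.compile("^(.+)-([^-]+)-([^-]+)\\.([^.]+)\\.rpm$");

    final String name;
    final String version;
    final String release;
    final String arch;

    private RpmFileName(String name, String version, String release, String arch) {
        this.name = name;
        this.version = version;
        this.release = release;
        this.arch = arch;
    }

    /** Parses a file name, or throws if it does not match the assumed layout. */
    static RpmFileName parse(String fileName) {
        Matcher m = NVRA.matcher(fileName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not an RPM file name: " + fileName);
        }
        return new RpmFileName(m.group(1), m.group(2), m.group(3), m.group(4));
    }

    public static void main(String[] args) {
        RpmFileName p = parse("htmlcleaner-2.2.1-10.fc26.noarch.rpm");
        // prints "htmlcleaner 2.2.1 10.fc26 noarch"
        System.out.println(p.name + " " + p.version + " " + p.release + " " + p.arch);
    }
}
```

For the source package, the same call yields "src" in the architecture field.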

Install Howto

Install the htmlcleaner rpm package:

# dnf install htmlcleaner




Changelog

2017-06-26 - Björn Esser - 2.2.1-10
- Fix build on recent Fedora releases
2017-02-10 - Fedora Release Engineering - 2.2.1-9
- Rebuilt for
2016-02-03 - Fedora Release Engineering - 2.2.1-8
- Rebuilt for
2015-06-17 - Fedora Release Engineering - 2.2.1-7
- Rebuilt for
2014-06-07 - Fedora Release Engineering - 2.2.1-6
- Rebuilt for
2014-04-08 - Michael Simacek - 2.2.1-5
- Remove wagon from extensions
2014-03-28 - Michael Simacek - 2.2.1-4
- Use Requires: java-headless rebuild (#1067528)
2013-08-03 - Fedora Release Engineering - 2.2.1-3
- Rebuilt for
2013-06-20 - Marcin Dulak - 2.2.1-2
- fix bug #973084 comment #11
2013-06-07 - Marcin Dulak - 2.2.1-1
- initial release

See Also

Package                                       Description
htmlcleaner-javadoc-2.2.1-10.fc26.noarch.rpm  API documentation for htmlcleaner
htop-2.1.0-1.fc26.x86_64.rpm                  Interactive process viewer
httpd-2.4.33-4.fc26.x86_64.rpm                Apache HTTP Server
httpd-devel-2.4.33-4.fc26.x86_64.rpm          Development interfaces for the Apache HTTP server
httpd-filesystem-2.4.33-4.fc26.noarch.rpm     The basic directory layout for the Apache HTTP server
httpd-manual-2.4.33-4.fc26.noarch.rpm         Documentation for the Apache HTTP server
httpd-tools-2.4.33-4.fc26.x86_64.rpm          Tools for use with the Apache HTTP Server
httpie-0.9.4-9.fc26.noarch.rpm                A Curl-like tool for humans
httrack-3.49.2-1.fc26.i686.rpm                Website copier and offline browser
httrack-3.49.2-1.fc26.x86_64.rpm              Website copier and offline browser
httrack-devel-3.49.2-1.fc26.i686.rpm          Development files for httrack
httrack-devel-3.49.2-1.fc26.x86_64.rpm        Development files for httrack
hugin-2017.0.0-1.fc26.x86_64.rpm              A panoramic photo stitcher and more
hugin-base-2017.0.0-1.fc26.x86_64.rpm         Command-line tools and libraries required by hugin
humanity-icon-theme-0.6.13-1.fc26.noarch.rpm  Humanity icon theme