Performant and powerful keyword matching, extraction and annotation for the DOM

AlexJeffcott/adorn

Background and Inspiration

Want keyword matching for large amounts of text (with large numbers of keywords) in JavaScript-friendly environments?

Want to leverage knowledge resources to enrich a website's content without the overhead of curating and maintaining the extra content?

Want to add interactive elements (simple styling, links, tooltips, or something more complex) based on a user's profile or settings?

Adorn your content nimbly, but with power and precision.

Performance

The FlashText algorithm is linear: it can search for or replace keywords in a single pass over a document, and its running time does not depend on the number of terms being searched or replaced. For a document of N characters and a dictionary of M keywords, the time complexity is therefore O(N). That makes it much faster than matching each keyword with a regex, which costs O(M x N).

The Aho–Corasick algorithm, by contrast, is linear in the total length of the keywords plus the length of the searched text plus the number of matches reported.
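To make the linear-time claim more concrete, here is a minimal sketch of a trie-based, single-pass keyword scan in the spirit of FlashText. It is not adorn's actual implementation: `TrieNode`, `Hit`, `buildTrie` and `findKeywords` are hypothetical names chosen for illustration, the 'New York' keywords are made up, and word boundaries are simplified to ASCII letters and digits.

```typescript
// Hypothetical trie node: children keyed by character, plus the id of the
// keyword that ends at this node (if any).
interface TrieNode {
  children: Map<string, TrieNode>;
  id?: string;
}

interface Hit {
  id: string;
  start: number; // index of the first matched character
  end: number; // index just past the last matched character
}

// Build a trie from a map of id -> keywords (the shape used in the examples
// in this README). Everything is lowercased, so this sketch is
// case-insensitive only.
function buildTrie(keywords: Map<string, string[]>): TrieNode {
  const root: TrieNode = { children: new Map() };
  keywords.forEach((terms, id) => {
    terms.forEach((term) => {
      const lower = term.toLowerCase();
      if (lower.length === 0) return;
      let node = root;
      for (let k = 0; k < lower.length; k++) {
        const ch = lower.charAt(k);
        let next = node.children.get(ch);
        if (!next) {
          next = { children: new Map() };
          node.children.set(ch, next);
        }
        node = next;
      }
      node.id = id;
    });
  });
  return root;
}

// Simplification: only ASCII letters and digits count as word characters.
const isWordChar = (ch: string): boolean => /[A-Za-z0-9]/.test(ch);

// Scan the text once. Matches may only start and end on word boundaries,
// and the longest keyword starting at a given position wins.
function findKeywords(text: string, root: TrieNode): Hit[] {
  const hits: Hit[] = [];
  const lower = text.toLowerCase();
  let i = 0;
  while (i < lower.length) {
    const atWordStart = isWordChar(lower.charAt(i)) && !isWordChar(lower.charAt(i - 1));
    if (!atWordStart) {
      i++;
      continue;
    }
    let node: TrieNode = root;
    let best: Hit | undefined;
    for (let j = i; j < lower.length; j++) {
      const next = node.children.get(lower.charAt(j));
      if (!next) break;
      node = next;
      if (next.id !== undefined && !isWordChar(lower.charAt(j + 1))) {
        best = { id: next.id, start: i, end: j + 1 }; // keep the longest so far
      }
    }
    if (best) {
      hits.push(best);
      i = best.end; // resume after the match, so matches never overlap
    } else {
      i++;
    }
  }
  return hits;
}

const trie = buildTrie(
  new Map<string, string[]>([
    ['123', ['Ipsum']],
    ['ny', ['New York']],
    ['nyc', ['New York City']],
  ]),
);
const hits = findKeywords('Lorem ipsum, then New York City.', trie);
console.log(hits.map((h) => h.id)); // logs the ids '123' and 'nyc'
```

Each step is a single map lookup in the trie, so the cost of the scan depends on the length of the text (and, in the worst case, on the length of the longest keyword), not on how many keywords the dictionary holds.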

Power

Several methods are available depending on what you wish to achieve, including a simple list of match ids, a list of 'dirty' matches, and the full text returned with matches wrapped in a custom tag.

It is ready to be used with static HTML and plain text, in Node and in the browser. There is also support for a simple implementation in React.

It is possible to have case-insensitive AND case-sensitive matches in the same Match instance. In other words, the flower 'rose' is not matched when looking for the name 'Rose', while both 'flower' and 'Flower' can be matched by that same instance. The original flash-text implementation does not allow this.
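One way to picture mixing the two modes is a pair of lookup tables, one keyed by the exact text and one keyed by the lowercased text. The sketch below is only an illustration of that idea, not adorn's internals: `KeywordMap` and `makeLookup` are hypothetical names, and giving case-sensitive keywords precedence is an assumption.

```typescript
// id -> keywords, the same shape as the maps passed to Match in the examples.
type KeywordMap = Map<string, string[]>;

// Returns a function that resolves a single word to a match id (or undefined).
function makeLookup(
  insensitive: KeywordMap,
  sensitive: KeywordMap,
): (word: string) => string | undefined {
  const exact = new Map<string, string>(); // keyed by the keyword as written
  const folded = new Map<string, string>(); // keyed by the lowercased keyword

  sensitive.forEach((terms, id) => terms.forEach((t) => exact.set(t, id)));
  insensitive.forEach((terms, id) => terms.forEach((t) => folded.set(t.toLowerCase(), id)));

  // Assumption: an exact (case-sensitive) hit wins over a case-insensitive one.
  return (word) => exact.get(word) ?? folded.get(word.toLowerCase());
}

const lookup = makeLookup(
  new Map<string, string[]>([['plant', ['flower']]]),
  new Map<string, string[]>([['name', ['Rose']]]),
);

console.log(lookup('Rose')); // 'name'
console.log(lookup('rose')); // undefined
console.log(lookup('Flower')); // 'plant'
console.log(lookup('flower')); // 'plant'
```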

You can optionally listen for DOM mutations and scrolling.

One possible disadvantage of this algorithm is that it does not match substrings and returns only the longest match. If you need substring matches, or every match rather than just the longest, Aho–Corasick is probably a better choice for you.

Some examples

While the examples below showcase HTML wrapping via annotateDOM with a custom element, there is no reason not to take a simpler approach and add a class, inline styles, or even an anchor link with an href calculated from the matched keyword.

Check out the React TypeScript CodeSandbox.

Usage in static HTML as a JavaScript plugin
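As a rough sketch of those simpler approaches, the options objects below reuse the `tag` / `getAttrs` shape that appears in the usage examples in this README. The `AnnotateOptions` type is a hypothetical local stand-in (not an export of the library), the glossary URL is a placeholder, and since `getAttrs` receives the match id in those examples, the href here is derived from the id.

```typescript
type Attr = [string, string];

// Hypothetical local type mirroring the `opts` object used with Match.
interface AnnotateOptions {
  tag: string;
  getAttrs: (id: string) => Attr[];
}

// Wrap matches in a plain <span> and style them through a CSS class.
const classOpts: AnnotateOptions = {
  tag: 'span',
  getAttrs: (id) => [
    ['class', 'keyword'],
    ['data-match-id', id],
  ],
};

// Wrap matches in an anchor whose href is computed from the match id.
// The base URL is a placeholder for illustration only.
const linkOpts: AnnotateOptions = {
  tag: 'a',
  getAttrs: (id) => [
    ['href', `https://example.com/glossary/${encodeURIComponent(id)}`],
    ['data-match-id', id],
  ],
};

console.log(classOpts.getAttrs('123'));
console.log(linkOpts.getAttrs('123'));
```

Either object would be passed as the third argument to `new Match(...)` in place of the custom-element options, assuming the library treats the tag name as opaque.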

    <html lang="en">
      <head>
        <style>
          x-annotate {
            border-bottom-color: cadetblue;
            border-bottom-style: solid;
            border-bottom-width: 2px;
          }
        </style>
      </head>
      <body>
        <p>
          Lorem ipsum dolor sit amet, consectetur adipiscing elit. In cursus cursus enim eu
          scelerisque. Nam eleifend purus sed quam facilisis aliquet. Fusce feugiat neque elit, non
          egestas ipsum molestie quis. Suspendisse quis ipsum malesuada, scelerisque tellus quis,
          auctor tortor. Nam gravida dolor at molestie facilisis. Donec faucibus nisl vitae ante
          accumsan, id vulputate lorem convallis. Integer condimentum nunc turpis, eget pellentesque
          nunc gravida nec. Maecenas in tincidunt eros. Nullam ac feugiat turpis. Interdum et
          malesuada fames ac ante ipsum primis in faucibus. Nullam at posuere urna. Phasellus
          fermentum dolor nec sapien congue feugiat. Duis aliquam, ex finibus porttitor viverra, quam
          augue gravida dui, quis cursus purus justo a mi.
        </p>
        <script type="module">
          import { TextNodesFromDOM, Match, annotateDOM } from '../annotate/build';

          // Match id -> keywords, matched case-insensitively.
          const insensitive = new Map([
            ['123', ['Ipsum']],
            ['456', ['neque']],
            ['789', ['Ut']],
          ]);
          // Match id -> keywords, matched case-sensitively.
          const sensitive = new Map([['321', ['Nullam']]]);

          // Wrap each match in <x-annotate data-match-id="...">.
          // (No type annotations here: this is plain JavaScript in the browser.)
          const opts = {
            tag: 'x-annotate',
            getAttrs: (id) => [['data-match-id', id]],
          };
          const match = new Match(insensitive, sensitive, opts);
          const textNodesFromDOM = new TextNodesFromDOM(document.body, [opts.tag.toUpperCase()]);

          // Annotate the current page, then again on DOM mutations and on scroll.
          annotateDOM(textNodesFromDOM.walk(document.body), match);
          textNodesFromDOM.watchDOM((ns) => annotateDOM(ns, match));
          textNodesFromDOM.watchScroll((ns) => annotateDOM(ns, match));
        </script>
      </body>
    </html>

Simple usage in React

    import { FC, useEffect } from 'react';
    import { TextNodesFromDOM, Match, annotateDOM } from 'annotate';

    // Match id -> keywords, matched case-insensitively.
    const insensitive = new Map([
      ['123', ['Ipsum']],
      ['456', ['neque']],
      ['789', ['Ut']],
    ]);
    // Match id -> keywords, matched case-sensitively.
    const sensitive = new Map([['321', ['Nullam']]]);

    const opts = {
      tag: 'x-annotate',
      getAttrs: (id: string) => [['data-match-id', id]],
    };
    // Pass the two maps defined above.
    const match = new Match(insensitive, sensitive, opts);
    const textNodesFromDOM = new TextNodesFromDOM(document.body, [opts.tag.toUpperCase()]);

    const Ipsum: FC = () => {
      useEffect(() => {
        // Annotate once on mount, then again whenever the user scrolls.
        annotateDOM(textNodesFromDOM.walk(document.body), match);
        const scrollCB = textNodesFromDOM.watchScroll((ns: Node[]) => annotateDOM(ns, match));
        // Stop watching scroll events when the component unmounts.
        return () => textNodesFromDOM.endWatchScroll(scrollCB);
      }, []);

      return (
        <p>
          Lorem ipsum dolor sit amet, consectetur adipiscing elit. In cursus cursus enim eu
          scelerisque. Nam eleifend purus sed quam facilisis aliquet. Fusce feugiat neque elit, non
          egestas ipsum molestie quis. Suspendisse quis ipsum malesuada, scelerisque tellus quis,
          auctor tortor. Nam gravida dolor at molestie facilisis. Donec faucibus nisl vitae ante
          accumsan, id vulputate lorem convallis. Integer condimentum nunc turpis, eget pellentesque
          nunc gravida nec. Maecenas in tincidunt eros. Nullam ac feugiat turpis. Interdum et
          malesuada fames ac ante ipsum primis in faucibus. Nullam at posuere urna. Phasellus
          fermentum dolor nec sapien congue feugiat. Duis aliquam, ex finibus porttitor viverra, quam
          augue gravida dui, quis cursus purus justo a mi.
        </p>
      );
    };
