Simulate treesitter manually (in an arbitrary buffer region) #35855

Pinned · Unanswered
daniilrozanov asked this question in Q&A

EDIT: issue #35907

Suppose there is a read-only buffer and I am manually placing some text into it via the Neovim Lua API. This text has the structure of a list, where each item consists of several elements, even though it is just plain text. For example, it is a list of paragraphs where each one has a heading and content.

I want to be able to fold, conceal, and highlight parts of this text even though there is no treesitter parser for it, and it is not even possible to create such a parser. An example of how it might look is the neogit plugin, which can fold and unfold a git diff without a parser.

So: can I manually tell Neovim which parts of the text are "treesitter" nodes, building the AST by hand so that I can treat my text as if it had been parsed by treesitter? Probably using some internal Neovim functions.

I tried to figure out whether I can do this with extmarks, but I didn't find anything about it in the docs.


Replies: 4 comments 3 replies


> Can I manually tell Neovim which parts of the text are "treesitter" nodes, building the AST by hand so that I can treat my text as if it had been parsed by treesitter?

There is `get_string_parser`, which can parse an arbitrary string. But you want to activate it for a particular region of a buffer. I'm not aware of an "ergonomic" way to do that, but I wonder if it makes sense to define the "outer" parts of the region as some sort of "null grammar", so that you can treat the region as an "injected" language.

cc @clason @vanaigr
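For reference, a minimal sketch of what parsing an arbitrary string looks like. It assumes it runs inside Neovim (for example via `:luafile`) and uses the bundled `lua` parser purely as a stand-in language; the sample string and variable names are illustrative, not something from this thread.

```lua
-- Assumption: executed inside Neovim, where the 'lua' parser ships by default.
local source = 'local answer = 42'

-- Parse a plain string instead of a buffer.
local parser = vim.treesitter.get_string_parser(source, 'lua')
local tree = parser:parse()[1]
local root = tree:root()

-- Print the type and range of each direct child of the root node.
for child in root:iter_children() do
  local start_row, start_col, end_row, end_col = child:range()
  print(child:type(), start_row, start_col, end_row, end_col)
end
```

This still requires a real grammar for the string being parsed, which is exactly the piece missing in the use case above.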

@daniilrozanov

After some thought, I think the idea of a "null grammar" follows from what I mentioned earlier, since this is a valid case.


@justinmk, I did not accurately describe the problem. I need to turn the entire buffer into nodes, not an arbitrary part. The problem is that there is not and will not be a parser for my buffer, but when generating text for the buffer, I could manually set these nodes so that I could then treat the buffer text as if it were some kind of language (folding, highlighting).

I looked at how it is implemented in neogit. There, the author implemented these features completely independently of the treesitter mechanisms, duplicating all the logic for folding and highlighting (this function, for example, as well as the entire file, looks like a handmade treesitter), and wrote his own node "class" and recursive traversal function. It seems to me that this is unnecessary: if I could simulate the behavior of a parser, I could reuse the existing treesitter logic.

PS. My idea is to make a plugin, forge.nvim, which would relate to neogit as forge (the Emacs package) relates to magit. PRs and issues should be presented as a list, with concealable descriptions and appropriate highlights for assignees, labels, statuses, etc. Obviously there is no parser for that, and I don't think one is possible, since this is just arbitrary text. But while generating, for example, an issue "page", I have the issue title, the issue author, the conversation, and so on, so I could mark all those elements as treesitter nodes and then just write a `highlights.scm` for it (worth noting that each unit of an issue's conversation is Markdown text, so it should definitely be an injected grammar). Exposing the internal treesitter API would be useful for that, though I'm not sure what the proper way of doing it is. If this is already possible, let me know; if not, could it be a feature request?
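To make the "mark elements while generating" idea concrete, here is a minimal sketch in plain Lua. The helper `render_issues`, the layout (blank line after the title, two blank lines between issues), and the shape of the node tables are all hypothetical; the only point is that the generator already knows every range at the moment it emits the text.

```lua
-- Hypothetical helper: builds the buffer lines and records, for every part,
-- a { start_row, start_col, end_row, end_col } range (0-based, end-exclusive).
local function render_issues(issues)
  local lines, nodes = {}, {}
  for _, issue in ipairs(issues) do
    if #lines > 0 then
      -- Two blank lines separate consecutive issues.
      lines[#lines + 1] = ''
      lines[#lines + 1] = ''
    end

    local issue_start = #lines
    lines[#lines + 1] = issue.title
    nodes[#nodes + 1] = { type = 'title', range = { issue_start, 0, issue_start + 1, 0 } }

    lines[#lines + 1] = '' -- blank line between title and description
    local desc_start = #lines
    for _, text in ipairs(issue.description) do
      lines[#lines + 1] = text
    end
    nodes[#nodes + 1] = { type = 'description', range = { desc_start, 0, #lines, 0 } }
    nodes[#nodes + 1] = { type = 'issue', range = { issue_start, 0, #lines, 0 } }
  end
  return lines, nodes
end

local lines, nodes = render_issues({
  { title = 'Some issue', description = { 'Description description', 'description' } },
  { title = 'Another issue', description = { 'Description description', 'description 2' } },
})

-- Print each recorded node as "type  start_row,start_col,end_row,end_col".
for _, node in ipairs(nodes) do
  print(node.type, table.concat(node.range, ','))
end
```

The `lines` table could be written to the buffer as-is, and the recorded ranges could then drive whatever mechanism ends up doing the highlighting and folding, whether that is extmarks or a hand-built node tree.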


Since tree-sitter doesn't expose an API for creating nodes, you would need to reimplement some of it in Lua. It shouldn't be a lot of work, though, since the buffer is not modifiable.

I've written some code that implements the minimum required for treesitter highlighting to work with a custom AST. It's hacky and also requires one change to Neovim's files, but you can try to implement other features and then make a PR exposing the required APIs.

```lua
local ts = require('vim.treesitter')
local hl = require('vim.treesitter.highlighter')
local lg = require('vim.treesitter.language')

local customNs = vim.api.nvim_create_namespace('Custom namespace')

local bufnr = vim.api.nvim_get_current_buf()

local lines = {
  'Some issue',
  '',
  'Description description',
  'description',
  '',
  '',
  'Another issue',
  '',
  'Description description',
  'description 2',
}
vim.api.nvim_buf_set_lines(bufnr, 0, -1, true, lines)

vim.api.nvim_set_hl(0, '@title.custom_lang', { bold = true })
vim.api.nvim_set_hl(0, '@description.custom_lang', { italic = true })

local Node
Node = {
  _debug = 'custom node',
  range = function(self, bytes)
    if bytes then
      return self._range[1],
        self._range[2],
        vim.api.nvim_buf_get_offset(self._tree._bufnr, self._range[1]) + self._range[2],
        self._range[3],
        self._range[4],
        vim.api.nvim_buf_get_offset(self._tree._bufnr, self._range[3]) + self._range[4]
    else
      return unpack(self._range)
    end
  end,
  id = function(self)
    return tostring(self) -- table 0x<some number>, probably unique? Don't know about unchanging
  end,
  iter_children = function(self)
    local index = 0
    return function()
      index = index + 1
      if index > #self._children then
        return
      else
        return self._children[index]
      end
    end
  end,
  named = function() return true end,
  missing = function() return false end,
  type = function(self) return self._type end,
  tree = function(self) return self._tree end,
}
Node.__index = Node

local function new_node(tree, type, range, children)
  return setmetatable({ _type = type, _tree = tree, _range = range, _children = children }, Node)
end

local TSTree
TSTree = {
  _debug = 'custom tstreee',
  root = function(self) return self._root end,
  included_ranges = function(self)
    return { { self:root():range(true) } }
  end,
}
TSTree.__index = TSTree

local function new_tstree(bufnr)
  local self = setmetatable({ _bufnr = bufnr }, TSTree)
  self._root = new_node(self, 'root', { 0, 0, 10, 0 }, {
    new_node(self, 'issue', { 0, 0, 4, 0 }, {
      new_node(self, 'title', { 0, 0, 1, 0 }, {}),
      new_node(self, 'description', { 2, 0, 4, 0 }, {}),
    }),
    new_node(self, 'issue', { 6, 0, 10, 0 }, {
      new_node(self, 'title', { 6, 0, 7, 0 }, {}),
      new_node(self, 'description', { 8, 0, 10, 0 }, {}),
    }),
  })
  return self
end

-- TODO: can probably use the Neovim's LangTree by creating it manually with the right fields
local LangTree = {
  __debug = 'custom lang tree',
  named_node_for_range = function(self)
    -- TODO
    return self._tree._root
  end,
  lang = function() return 'custom_lang' end,
  register_cbs = function() end,
  parse = function(self) return { self._tree } end,
  for_each_tree = function(self, cb) cb(self._tree, self) end,
  children = function() return {} end,
  source = function(self) return self._bufnr end,
}
LangTree.__index = LangTree

local customLangTree = setmetatable({ _bufnr = bufnr, _tree = new_tstree(bufnr) }, LangTree)

vim.api.nvim_set_option_value('ft', 'custom_lang', { buf = bufnr })

vim.api.nvim_buf_attach(bufnr, false, {
  --on_bytes = function(...) return customLangTree:on_bytes(...) end,
  on_detach = function(...)
    local parsers = ts.non_existent_get_parsers()
    if parsers[bufnr] == customLangTree then
      parsers[bufnr] = nil
    end
    customLangTree:on_detach(...)
  end,
  --on_reload = function(...) return customLangTree:on_reload(...) end,
})

local customQuery = {
  __debug = 'custom query',
  inspect = function()
    return {
      captures = { 'issue', 'title', 'description' },
      patterns = {},
    }
  end,
}

-- function() return parsers end, but in runtime/lua/vim/treesitter.lua
ts.non_existent_get_parsers()[bufnr] = customLangTree

local origAdd = lg.add
lg.add = function() return true end

local origParseQuery = vim._ts_parse_query
vim._ts_parse_query = function() return customQuery end

local origCreateQueryCursor = vim._create_ts_querycursor
vim._create_ts_querycursor = function(node, query, ...)
  if query == customQuery then
    local matches = {}
    local match_id = 0

    local function add_captures(current_node)
      if current_node:type() == 'issue' then
        local current_match_id = match_id
        match_id = match_id + 1
        table.insert(matches, {
          1,
          current_node,
          { info = function() return current_match_id, -999 end },
        })
      elseif current_node:type() == 'title' then
        local current_match_id = match_id
        match_id = match_id + 1
        table.insert(matches, {
          2,
          current_node,
          { info = function() return current_match_id, -999 end },
        })
      elseif current_node:type() == 'description' then
        local current_match_id = match_id
        match_id = match_id + 1
        table.insert(matches, {
          3,
          current_node,
          { info = function() return current_match_id, -999 end },
        })
      end
      for child in current_node:iter_children() do
        add_captures(child)
      end
    end
    add_captures(node)

    local result = {
      __debug = 'custom tsquerycursor',
      __matches = matches,
      __matchI = 0,
      next_capture = function(self)
        self.__matchI = self.__matchI + 1
        local match = self.__matches[self.__matchI]
        if match then
          return unpack(match)
        else
          return nil
        end
      end,
      next_match = function() end,
    }
    return result
  else
    return origCreateQueryCursor(node, query, ...)
  end
end

hl.new(customLangTree, {
  queries = {
    [customLangTree:lang()] = 'do not\0parse'
  }
})

lg.add = origAdd
vim._ts_parse_query = origParseQuery
```
@daniilrozanov

Thanks a lot!! This is exactly what I needed. I think I'll work on a PR.

@daniilrozanov

After some research, I see two ways of doing that.

1

The first is similar to what @vanaigr suggested: create TSNode and TSTree Lua types, (re)define all the logic implemented here, and then integrate the new types into the existing mechanisms.

The problem is that it requires reimplementing all of treesitter's existing C logic for trees and nodes. That doesn't take much effort if we only provide the minimum of treesitter's features, but duplicating types and logic is not ideal, in my opinion.

2

The second way seems cleaner to me. We need to trick treesitter's `ts_parser_parse` function. While looking at treesitter's sources and docs, I noticed that there could be room for customization, judging by its example. What if I make a function `const TSLanguage *tree_sitter_dummy_lang(/*user's tree*/);` that returns a stateful `TSLanguage *` whose state is the user's custom grammar tree, so that the language makes `TSParser` act as a translator from the user's custom tree to a `TSTree`?

The problematic part is creating such a `tree_sitter_dummy_lang`, unless it becomes a PR to treesitter (I could try). Then one could write something like

```c
// Use it in neovim C core
// non existent treesitters function
ts_parser_set_pseudo_language(parser, tree_sitter_dummy_lang());
```

and be happy. Then add something like this to the Lua API:

```lua
node = require('vim.treesitter.customnode')(--[[node's type, visibility, pos, children, etc]]) -- custom node
tree = require('vim.treesitter.customtree')(node) -- custom tree
vim.treesitter.language.add('custom_lang', { --[[new option]] tree = tree })
```

So the benefit is that we can use all of treesitter's features with almost zero changes to `treesitter.c`, though we would need to forbid some modifying functions on a custom tree.

Bear in mind, though, that I have only superficially figured out how the Neovim C core works and how it interacts with treesitter, so I may be mistaken or missing something. Any thoughts?


I have started working on that feature and will try to send a PR soon.

3 participants: @daniilrozanov, @justinmk, @vanaigr