This repository was archived by the owner on Sep 28, 2023. It is now read-only.

Quadtree - gradient-boosted decision tree model used to predict guanine quadruplexes in DNA sequences

patrikkaura/quadtree

The Quadtree is a gradient-boosted decision tree model used to predict guanine quadruplexes in DNA sequences. It is built on top of the LightGBM Python library. Each base in a sequence is encoded according to a given encoding prescription. The model was trained to be applied with a sliding window, so it analyses the whole sequence. The model can be used as a Python script or through the preview website quadtree.vercel.app
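The encode-then-slide idea described above can be sketched roughly as follows. This is only an illustration: the README does not state the actual encoding prescription, window size, or step, so `BASE_CODES`, `UNKNOWN_CODE`, `encode`, and `sliding_windows` are hypothetical names and values, not the ones used in quadtree.py.

```python
from typing import Iterator, List, Tuple

# Hypothetical encoding prescription; the real one is not given in the README.
BASE_CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
UNKNOWN_CODE = 4  # assumed code for N and any other symbol


def encode(sequence: str) -> List[int]:
    """Map each base to an integer code (case-insensitive)."""
    return [BASE_CODES.get(base, UNKNOWN_CODE) for base in sequence.upper()]


def sliding_windows(
    sequence: str, window_size: int, step: int = 1
) -> Iterator[Tuple[int, List[int]]]:
    """Yield (start, encoded_window) for every full window in the sequence."""
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    encoded = encode(sequence)
    for start in range(0, len(encoded) - window_size + 1, step):
        yield start, encoded[start:start + window_size]


# Each encoded window would be one feature row passed to the classifier.
windows = list(sliding_windows("GGGAGGGTNA", window_size=4, step=2))
```

In a setup like this, the per-window scores would then be compared against the threshold to decide which regions to report.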

Repository structure

quadtree
    └─ web -> preview website source code
    └─ python
          └─ model -> lightgbm model params
          └─ train -> example files showing how training was performed
          └─ quadtree.py -> predictor

Requirements

  • lightgbm==3.3.2
  • numpy==1.21.2

Install dependencies

Before use, install the requirements:

  pip install -r requirements.txt

Usage

Create model instance

  from quadtree import Quadtree
  model = Quadtree()

Run analysis - algorithm inputs

  • sequence as a string (maximum length is not limited)
  • threshold (recommended value is 0.2)
  • quadnet model file path

  result = model.analyse(sequence='ATTAATACTTTTAACAATTGTAGTATATAAAAAAGGGAGTAACC...', model_path='/path/to/quadnet_model.txt', score_threshold=0.1)

The results are returned in a form that can be loaded directly into a pandas DataFrame:

  import pandas as pd
  df = pd.DataFrame(result)

   index  position  sequence                                                                                                     length
0  0      907       GCAACAATGGCTGATCCAGAAGGTACAGACGGGGAGGGCACGGGTTGTAACGGCTGGTTTTATGTACAAGCTATTGTAGACAAAAAAACAGGAGATGTAATATCA    105
1  1      1184      GAGGCAGCACAGAAAACAGTCCATTAGGGGAGCGGCTGGAGGTGGATACAGAGTTAAGTCCACGGTTACAAGAAATATCTTTAAATAGTGGGCAGA             96
2  2      1389      ATGTAGTGGCGGCAGTACGGAGGCTATAGACAACGGGGGCACAGAGGGCAACAACAGCAGTGTAGACGGTACAAGTGACAATAGCAATATAGAAAATGTAAATCCAC  107
3  3      1635      AGATTGGGTTACAGCTATATTTGGAGTAAACCCAACAATAGCAGAAGGATTTAAAACACTAATACAGCCATTTAT                                  75
4  4      2229      AATAGATGAAGGGGGAGATTGGAGACCAATAGTGCAATTCCTGCGATACCAACAAATAGAGTTTATAACATTTTTAG                                77
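As a quick sanity check on output like the example table above, the reported rows can be post-processed in plain Python. The sketch below uses the last two rows copied from that table. The list-of-dicts layout is an assumption, since the README does not document the exact return type, and `longest_g_run` and `summarise` are hypothetical helpers rather than part of quadtree.py.

```python
from itertools import groupby

# Two rows copied from the example table; the list-of-dicts layout is assumed.
rows = [
    {"index": 3, "position": 1635, "length": 75,
     "sequence": "AGATTGGGTTACAGCTATATTTGGAGTAAACCCAACAATAGCAGAAGGATTTAAAACACTAATACAGCCATTTAT"},
    {"index": 4, "position": 2229, "length": 77,
     "sequence": "AATAGATGAAGGGGGAGATTGGAGACCAATAGTGCAATTCCTGCGATACCAACAAATAGAGTTTATAACATTTTTAG"},
]


def longest_g_run(sequence: str) -> int:
    """Length of the longest uninterrupted run of G in the sequence."""
    return max(
        (len(list(group)) for base, group in groupby(sequence.upper()) if base == "G"),
        default=0,
    )


def summarise(row: dict) -> dict:
    """Check the reported length and add simple guanine statistics."""
    seq = row["sequence"]
    return {
        "index": row["index"],
        "position": row["position"],
        "length_ok": len(seq) == row["length"],
        "g_fraction": seq.upper().count("G") / len(seq),
        "longest_g_run": longest_g_run(seq),
    }


summary = [summarise(row) for row in rows]
for item in summary:
    print(item)
```

Statistics such as the guanine fraction are not produced by the model itself; they are only a convenient way to eyeball whether a reported region looks G-rich.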

Model scheme

[Figure: Quadtree model scheme, left-to-right layout]

Training parameters

These parameters were used to train the LightGBM model:

LGBM Classifier         value
colsample bytree        0.817574864502621
learning rate           0.03744835808549148
max bin                 127
min child sample        3
number of estimators    1000
number of leaves        74
regularization alpha    0.0033803043003857677
regularization lambda   0.7013136087939289
objective               binary
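For readers who want to reproduce a comparable setup, the values above could be passed to LightGBM's scikit-learn wrapper roughly as follows. The values are copied from the table, but the mapping of the table labels onto `LGBMClassifier` keyword names is an assumption, and `TRAINING_PARAMS` and `build_classifier` are hypothetical names; the actual training code lives in the train folder of the repository.

```python
# Values copied from the training-parameter table; the keyword names are an
# assumed mapping onto lightgbm.LGBMClassifier arguments.
TRAINING_PARAMS = {
    "colsample_bytree": 0.817574864502621,
    "learning_rate": 0.03744835808549148,
    "max_bin": 127,
    "min_child_samples": 3,
    "n_estimators": 1000,
    "num_leaves": 74,
    "reg_alpha": 0.0033803043003857677,
    "reg_lambda": 0.7013136087939289,
    "objective": "binary",
}


def build_classifier():
    """Create an untrained classifier; lightgbm is imported lazily so the
    parameter dictionary can be inspected without the library installed."""
    from lightgbm import LGBMClassifier

    return LGBMClassifier(**TRAINING_PARAMS)
```

Calling `build_classifier()` returns an unfitted estimator; it would still need to be fitted on encoded windows and their labels before it can predict anything.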

Authors

License

This project is licensed under the MIT License - see the LICENSE file for details.
