item
The Instruction Trace Embedding Model (ITEM).
This is a custom BERT-like model built for binary code: it takes advantage of data unique to the binary domain (structural and dataflow information) to improve performance over a naive transformer implementation.
- class undertale.models.item.TransformerEncoder(depth: int, hidden_dimensions: int, vocab_size: int, input_size: int, heads: int, intermediate_dimensions: int, dropout: float, eps: float)
Bases: Module
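A minimal instantiation sketch. The hyperparameter values below are illustrative only, and the forward call assumes a BERT-style interface that takes a batch of token IDs; the actual call signature (e.g., an attention-mask argument) may differ.

```python
import torch

from undertale.models.item import TransformerEncoder

# Hypothetical hyperparameters, chosen for illustration only.
encoder = TransformerEncoder(
    depth=12,
    hidden_dimensions=768,
    vocab_size=8192,
    input_size=512,
    heads=12,
    intermediate_dimensions=3072,
    dropout=0.1,
    eps=1e-12,
)

# Assumption: the encoder consumes a batch of token-ID sequences
# (shape: [batch, sequence]) like a standard BERT-style model.
tokens = torch.randint(0, 8192, (2, 512))
embeddings = encoder(tokens)
```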
- class undertale.models.item.TransformerEncoderForMaskedLM(depth: int, hidden_dimensions: int, vocab_size: int, input_size: int, heads: int, intermediate_dimensions: int, dropout: float, eps: float, lr: float, warmup: float)
Bases: LightningModule, Module
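Because this class is a LightningModule, pretraining can presumably be driven by a standard Lightning Trainer. A sketch under that assumption: the hyperparameters are illustrative, `train_dataloader` is a hypothetical DataLoader of masked-token batches, and the semantics of warmup (fraction of steps vs. step count) are assumed.

```python
import pytorch_lightning as pl

from undertale.models.item import TransformerEncoderForMaskedLM

# Hypothetical hyperparameters, chosen for illustration only.
model = TransformerEncoderForMaskedLM(
    depth=12,
    hidden_dimensions=768,
    vocab_size=8192,
    input_size=512,
    heads=12,
    intermediate_dimensions=3072,
    dropout=0.1,
    eps=1e-12,
    lr=1e-4,     # peak learning rate
    warmup=0.1,  # assumed to be a warmup fraction of total steps
)

# A standard Lightning training loop; `train_dataloader` would yield
# batches of masked token sequences produced by the tokenizer.
trainer = pl.Trainer(max_steps=100_000)
# trainer.fit(model, train_dataloader)
```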
- class undertale.models.item.TransformerEncoderForSequenceSimilarity(*args, **kwargs)
Bases: Module
- class undertale.models.item.TransformerEncoderForSequenceClassification(classes: int, depth: int, hidden_dimensions: int, vocab_size: int, input_size: int, heads: int, intermediate_dimensions: int, dropout: float)
Bases: Module
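A hypothetical finetuning setup for a four-class classification head. The hyperparameters are illustrative, and the forward interface and output shape are assumed rather than documented.

```python
import torch

from undertale.models.item import TransformerEncoderForSequenceClassification

# Hypothetical setup: classify code sequences into 4 classes.
classifier = TransformerEncoderForSequenceClassification(
    classes=4,
    depth=12,
    hidden_dimensions=768,
    vocab_size=8192,
    input_size=512,
    heads=12,
    intermediate_dimensions=3072,
    dropout=0.1,
)

# Assumption: the model maps token-ID batches to per-class logits.
tokens = torch.randint(0, 8192, (2, 512))
logits = classifier(tokens)  # assumed shape: [batch, classes]
```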
- class undertale.models.item.TransformerEncoderForSequenceSummarizationGPT2(*args, **kwargs)
Bases: Module
Modules
- Finetune a pretrained model on a pairwise contrastive task.
- Finetune a pretrained model on a summarization task.
- Compute the embedding of some code given a finetuned model.
- Predict masked tokens given a pretrained model.
- Compute the similarity of two code samples given a finetuned model.
- Generate a summary for a piece of code given a finetuned model.
- Model implementation.
- Pretrain a model on a Dataflow Prediction (DP) task.
- Pretrain a model on a Masked Language Modeling (MLM) task.
- Tokenizer implementation and training script.