Type, "/@say/Your message here." after the end of the URL and hit enter to leave a comment. Type, "/find-search terms here.search" to do a full text (document contents) search of any *sub*-directory. If you want to mirror the entire server contact me first; it'll save you time.

Index of /library/Computing/transformers/

Name  Size  Date
Parent directory  -  -
Unigram Algorithm_ Subword Regularization_ Improving Neural Network Translation Models with Multiple Subword Candidates_ a..>  321.8 KiB  2023-May-13 17:38
Train Short, Test Long_ Attention with Linear Biases Enables Input Length Extrapolation_arxiv2108.12409.pdf  741.2 KiB  2023-Jun-17 23:34
Ties-Merging_ Resolving Interference When Merging Models_ arxiv2306.01708v2.pdf  1.1 MiB  2024-Nov-04 23:54
The Transformer Model in Equations_ John Thickstun_ 2023.pdf  191.0 KiB  2023-Jun-24 02:24
The Poison of Alignment_ arxiv2308.13449.pdf  185.3 KiB  2023-Aug-30 14:18
The Curse of Recursion_ Training on Generated Data Makes Models Forget_ arxiv2305.17493.pdf  2.2 MiB  2023-Aug-24 19:28
The case for 4-bit precision_ k-bit Inference Scaling Laws_ arxiv2212.09720.pdf  884.7 KiB  2023-Aug-28 18:57
Steering Llama 2 via Contrastive Activation Addition_ arxiv2312.06681.pdf  27.3 MiB  2023-Dec-13 04:54
Stay on topic with Classifier-Free Guidance_ arxiv2306.17806.pdf  1.9 MiB  2023-Sep-30 04:35
SmoothQuant_ Accurate and Efficient Post-Training Quantization for Large Language Models_ arxiv2211.10438.pdf  5.1 MiB  2023-Dec-11 23:12
SentencePiece_ A simple and language independent subword tokenizer and detokenizer for Neural Text Processing_ arxiv1808.0..>  206.7 KiB  2023-May-13 17:44
RULER_ What’s the Real Context Size of Your_ arxiv2404.06654v2.pdf  642.6 KiB  2024-Jul-30 02:47
RoFormer_ Enhanced Transformer with Rotary Position Embedding_ arxiv2104.09864v4.pdf  572.6 KiB  2023-Apr-21 00:35
Photonic Matrix Computing_ From Fundamentals to Applications_ Junwei Cheng_ Hailong Zhou_ Jianji Dong_ Nanomaterials 2021...>  3.5 MiB  2023-Jul-25 17:13
Mixtral of Experts_ arxiv2401.04088.pdf  2.4 MiB  2024-Jan-09 03:21
LLaMA_ Open and Efficient Foundation Language Models_ arxiv2302.13971.pdf  709.5 KiB  2023-May-13 17:45
Llama 2_ Open Foundation and Fine-Tuned Chat Models_ arxiv2307.09288.pdf  13.0 MiB  2023-Aug-30 23:29
Landmark Attention_ Random-Access Infinite Context Length for Transformers_ arxiv2305.16300.pdf  500.2 KiB  2023-May-28 17:34
Is Cosine-Similarity of Embeddings Really About Similarity_ arxiv2403.05440.pdf  1.6 MiB  2024-Mar-12 04:07
How Good Are Low-bit Quantized LLAMA3 Models_ An Empirical Study_ arxiv2404.14047v1.pdf  260.0 KiB  2024-Apr-26 20:16
GQA_ Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints_ arxiv2305.13245.pdf  248.2 KiB  2023-Sep-01 22:34
gpt4-maybe-leaked-details-sort-of-again.txt  7.9 KiB  2023-Jul-11 03:35
GLU Variants Improve Transformer_arxiv2002.05202.pdf  106.6 KiB  2023-May-02 21:23
Fourier Position Embedding_ Enhancing Attention’s Periodic Extension for Length Generalization_ arxiv2412.17739v1.pdf  793.3 KiB  2024-Dec-26 05:15
Extending Context Window of Large Language Models via Positional Interpolation_ arxiv2306.15595.pdf  733.6 KiB  2023-Jun-29 02:06
Exponentially Faster Language Modeling_ arxiv2311.10770.pdf  230.5 KiB  2023-Nov-27 05:35
Efficient streaming language models with attention sinks_ arxiv2309.17453.pdf  11.8 MiB  2023-Oct-02 17:46
DeepSeek-R1_ Incentivizing Reasoning Capability in LLMs via Reinforcement Learning_ arxiv2501.12948v1.pdf  1.3 MiB  2025-Jan-24 01:16
Deep neural networks are robust to weight binarization and other non-linear distortions_ arxiv1606.01981.pdf  828.6 KiB  2024-Mar-02 16:08
Climbing towards Natural Language Understanding_ On Meaning Form and Understanding in the Age of Data_ Emily M Bender- Ale..>  472.2 KiB  2023-May-08 03:18
Byte Latent Transformer_ Patches Scale Better Than Tokens_ A Pagnoni_ R Pasunuru_ R Rodriguez_ J Nguyen_ B Muller_ M Li_ C..>  2.2 MiB  2024-Dec-14 04:20
Are Emergent Abilities of Large Language Models a Mirage_ arxiv2304.15004.pdf  1.8 MiB  2023-May-07 01:08
An Ultra-Low Energy Internally Analog, Externally Digital Vector-Matrix Multiplier Based on NOR Flash Memory Technology_ M..>  1.7 MiB  2023-Jul-25 17:17

generated at 12:00:32, Wed Apr 16, 2025 UTC


Terms of Use:

You may not access or use the site superkuh.com if you are not over 90 years of age. If you do not agree, then you must leave now.

The US Dept. of Justice has determined that violating a website's terms of service is a felony under CFAA 1030(a)(2)(C). Absurd, isn't it?