Type "/@say/Your message here." after the end of the URL and hit enter to leave a comment. Type "/find-search terms here.search" to do a full-text (document contents) search of any *sub*directory. If you want to mirror the entire server, contact me first; it'll save you time.
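Both commands are plain URL suffixes, so they can be built in a couple of lines. A minimal sketch, assuming percent-encoding of spaces is acceptable to the server; the base directory URL and the helper names here are my own illustration, not part of the site:

```python
from urllib.parse import quote

# Hypothetical base URL for illustration; any directory on the site works.
BASE = "https://superkuh.com/library/Computing/transformers"

def say_url(base: str, message: str) -> str:
    """Build a comment URL by appending /@say/<message> to the directory URL."""
    return base.rstrip("/") + "/@say/" + quote(message)

def find_url(base: str, terms: str) -> str:
    """Build a full-text search URL by appending /find-<terms>.search."""
    return base.rstrip("/") + "/find-" + quote(terms) + ".search"

print(say_url(BASE, "Your message here"))
# https://superkuh.com/library/Computing/transformers/@say/Your%20message%20here
print(find_url(BASE, "attention sinks"))
# https://superkuh.com/library/Computing/transformers/find-attention%20sinks.search
```

Visiting the first URL in a browser leaves the comment; the second runs the search scoped to that subdirectory.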

Index of /library/Computing/transformers/

Name | Size | Date
Parent directory | - | -
An Ultra-Low Energy Internally Analog, Externally Digital Vector-Matrix Multiplier Based on NOR Flash Memory Technology_ M..> | 1.7 MiB | 2023-Jul-25 17:17
Are Emergent Abilities of Large Language Models a Mirage_ arxiv2304.15004.pdf | 1.8 MiB | 2023-May-07 01:08
Byte Latent Transformer_ Patches Scale Better Than Tokens_ A Pagnoni_ R Pasunuru_ R Rodriguez_ J Nguyen_ B Muller_ M Li_ C..> | 2.2 MiB | 2024-Dec-14 04:20
Climbing towards Natural Language Understanding_ On Meaning Form and Understanding in the Age of Data_ Emily M Bender- Ale..> | 472.2 KiB | 2023-May-08 03:18
Deep neural networks are robust to weight binarization and other non-linear distortions_ arxiv1606.01981.pdf | 828.6 KiB | 2024-Mar-02 16:08
DeepSeek-R1_ Incentivizing Reasoning Capability in LLMs via Reinforcement Learning_ arxiv2501.12948v1.pdf | 1.3 MiB | 2025-Jan-24 01:16
Efficient streaming language models with attention sinks_ arxiv2309.17453.pdf | 11.8 MiB | 2023-Oct-02 17:46
Exponentially Faster Language Modeling_ arxiv2311.10770.pdf | 230.5 KiB | 2023-Nov-27 05:35
Extending Context Window of Large Language Models via Positional Interpolation_ arxiv2306.15595.pdf | 733.6 KiB | 2023-Jun-29 02:06
Fourier Position Embedding_ Enhancing Attention’s Periodic Extension for Length Generalization_ arxiv2412.17739v1.pdf | 793.3 KiB | 2024-Dec-26 05:15
GLU Variants Improve Transformer_arxiv2002.05202.pdf | 106.6 KiB | 2023-May-02 21:23
gpt4-maybe-leaked-details-sort-of-again.txt | 7.9 KiB | 2023-Jul-11 03:35
GQA_ Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints_ arxiv2305.13245.pdf | 248.2 KiB | 2023-Sep-01 22:34
How Good Are Low-bit Quantized LLAMA3 Models_ An Empirical Study_ arxiv2404.14047v1.pdf | 260.0 KiB | 2024-Apr-26 20:16
Is Cosine-Similarity of Embeddings Really About Similarity_ arxiv2403.05440.pdf | 1.6 MiB | 2024-Mar-12 04:07
Landmark Attention_ Random-Access Infinite Context Length for Transformers_ arxiv2305.16300.pdf | 500.2 KiB | 2023-May-28 17:34
Llama 2_ Open Foundation and Fine-Tuned Chat Models_ arxiv2307.09288.pdf | 13.0 MiB | 2023-Aug-30 23:29
LLaMA_ Open and Efficient Foundation Language Models_ arxiv2302.13971.pdf | 709.5 KiB | 2023-May-13 17:45
Mixtral of Experts_ arxiv2401.04088.pdf | 2.4 MiB | 2024-Jan-09 03:21
Photonic Matrix Computing_ From Fundamentals to Applications_ Junwei Cheng_ Hailong Zhou_ Jianji Dong_ Nanomaterials 2021...> | 3.5 MiB | 2023-Jul-25 17:13
RoFormer_ Enhanced Transformer with Rotary Position Embedding_ arxiv2104.09864v4.pdf | 572.6 KiB | 2023-Apr-21 00:35
RULER_ What’s the Real Context Size of Your_ arxiv2404.06654v2.pdf | 642.6 KiB | 2024-Jul-30 02:47
SentencePiece_ A simple and language independent subword tokenizer and detokenizer for Neural Text Processing_ arxiv1808.0..> | 206.7 KiB | 2023-May-13 17:44
SmoothQuant_ Accurate and Efficient Post-Training Quantization for Large Language Models_ arxiv2211.10438.pdf | 5.1 MiB | 2023-Dec-11 23:12
Stay on topic with Classifier-Free Guidance_ arxiv2306.17806.pdf | 1.9 MiB | 2023-Sep-30 04:35
Steering Llama 2 via Contrastive Activation Addition_ arxiv2312.06681.pdf | 27.3 MiB | 2023-Dec-13 04:54
The case for 4-bit precision_ k-bit Inference Scaling Laws_ arxiv2212.09720.pdf | 884.7 KiB | 2023-Aug-28 18:57
The Curse of Recursion_ Training on Generated Data Makes Models Forget_ arxiv2305.17493.pdf | 2.2 MiB | 2023-Aug-24 19:28
The Poison of Alignment_ arxiv2308.13449.pdf | 185.3 KiB | 2023-Aug-30 14:18
The Transformer Model in Equations_ John Thickstun_ 2023.pdf | 191.0 KiB | 2023-Jun-24 02:24
Ties-Merging_ Resolving Interference When Merging Models_ arxiv2306.01708v2.pdf | 1.1 MiB | 2024-Nov-04 23:54
Train Short, Test Long_ Attention with Linear Biases Enables Input Length Extrapolation_arxiv2108.12409.pdf | 741.2 KiB | 2023-Jun-17 23:34
Unigram Algorithm_ Subword Regularization_ Improving Neural Network Translation Models with Multiple Subword Candidates_ a..> | 321.8 KiB | 2023-May-13 17:38

Documents added in the last: 7 days, 31 days
generated at 12:00:33, Sat Apr 12, 2025 UTC

Did you write a response to something in this directory listing? What's the URL?

Terms of Use:

You may not access or use the site superkuh.com if you are not over 90 years of age. If you do not agree, you must leave now.

The US Dept. of Justice has determined that violating a website's terms of service is a felony under CFAA 1030(a)(2)(C). Absurd, isn't it?