He's a good adventure dude.

Autoregressive long-context music generation with Perceiver AR


We present our work on music generation with Perceiver AR, an autoregressive architecture that is able to generate high-quality samples as long as 65k tokens—the equivalent of minutes of music, or entire pieces!

🎵Music Samples 📝ICML Paper GitHub Code DeepMind Blog

The playlist above contains samples generated by a Perceiver AR model trained on 10,000 hours of symbolic piano music (and synthesized with Fluidsynth).


Transformer-based architectures have been recently used to generate outputs from various modalities—text, images, music—in an autoregressive fashion. However, their compute requirements scale poorly with the input size, which makes modeling very long sequences computationally infeasible. This severely limits models’ abilities in settings where long-range context is useful for capturing domain-specific properties. Music domains offer a perfect testbed, since they often exhibit long-term dependencies, repeating sequences and overall coherence over entire minutes—all necessary ingredients for producing realistic samples that are pleasing to the human ear!

Transformer vs. Perceiver AR

To ameliorate these issues, we propose Perceiver AR, an autoregressive version of the original Perceiver architecture. A Perceiver model maps the input to a fixed-size latent space, where all further processing takes place. This enables scaling up to inputs of over 100k tokens! Perceiver AR builds on the initial Perceiver architecture by adding causal masking. This allows us to autoregressively generate music samples of high quality and end-to-end consistency, additionally achieving state-of-the-art performance on the MAESTRO dataset.


Perceiver AR model architecture

Perceiver AR first maps the inputs (in the diagram, [P,e,r,c,e,i,v,e,r,A,R]) to a fixed-size latent array, via a single cross-attention operation. These latents (3 illustrated above) then interact in a deep stack of self-attention layers to produce estimates for each target. The most recent inputs ([r,A,R]) correspond to queries, and each latent corresponds to a different target position ({1: A, 2: R, 3: <EOS>}).

Causal masking is used in both kinds of attention operations, to maintain end-to-end autoregressive ordering. Each latent can therefore only attend to (a) itself and (b) latents corresponding to ‘earlier’ information (either input tokens or target positions). This respects the standard autoregressive formulation, where the probability distribution for the t-th output is only conditioned on what was generated at previous timesteps 1, ..., t-1.
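The masking rule described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names `cross_attention_mask` and `self_attention_mask` are hypothetical, and the sketch assumes the latents line up with the last N input positions, as in the [P,e,r,c,e,i,v,e,r,A,R] example with 3 latents.

```python
def cross_attention_mask(num_inputs, num_latents):
    """mask[i][j] is True when latent i may attend to input position j.

    Assumption: latent i is queried by input position (num_inputs - num_latents + i),
    so it may see that position and everything before it.
    """
    if num_latents > num_inputs:
        raise ValueError("need at least as many inputs as latents")
    offset = num_inputs - num_latents  # input position that queries latent 0
    return [[j <= offset + i for j in range(num_inputs)]
            for i in range(num_latents)]


def self_attention_mask(num_latents):
    """mask[i][k] is True when latent i may attend to latent k (itself or earlier)."""
    return [[k <= i for k in range(num_latents)] for i in range(num_latents)]


tokens = list("PerceiverAR")
cross = cross_attention_mask(len(tokens), 3)
latent_self = self_attention_mask(3)

# Show which input tokens each latent is allowed to see.
for i, row in enumerate(cross):
    visible = "".join(tok for tok, allowed in zip(tokens, row) if allowed)
    print(f"latent {i} sees: {visible}")
```

In this sketch the first latent sees the inputs up to and including the first query token, and each later latent sees one more token, which matches the requirement that the t-th output depends only on earlier timesteps.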

In the music domain, we use up to 65k-token inputs, which corresponds to several minutes in the symbolic domain and one minute in the raw audio domain.

Symbolic music

The playlist at the top showcases 8 unconditional samples. These were generated by a model that was trained on 10,000 hours of transcribed YouTube piano performances containing examples between 1k and 32k tokens in length. The model had 1024 latents and 24 self-attention layers. Training on this large-scale dataset yields high-quality samples with stylistic and structural coherence—one can identify repeating musical themes, different chord progressions, arpeggios and even ritardandos. Moreover, the main difference from our previous model trained on YouTube piano performances is that a 32k input size was feasible this time, so we only used full-length pieces for training! This allowed Perceiver AR to better model entire pieces with beginning, middle and end sections.

Next, we present audio samples from the symbolic domain, obtained by training on MAESTRO v3. The input representation in both cases was computed from MIDI files as described by Huang et al. in Section A.2, and the final outputs were synthesized using Fluidsynth.

Raw audio

Perceiver AR can also be used to generate samples from the raw audio domain. Here, we applied the SoundStream codec to MAESTRO v3 .wav files to encode the raw audio. After training the model, we generated samples and decoded them into the source domain. Keeping the context length fixed, we experimented with 3 different codec bitrates (12kbps, 18kbps, 22kbps) which, for an input length of 65k tokens, span 54.4s, 36.8s and 29.6s of music, respectively. The examples below illustrate the trade-off between sample duration and fidelity: codecs with lower bitrates model coarser structure and let each training example cover a longer stretch of music, but sacrifice audio quality.
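To make the duration trade-off concrete, the short sketch below works backwards from the durations quoted above to the token rate each codec setting implies. It assumes that "65k tokens" means 65,536 tokens (2^16), which the text does not state exactly, so the resulting rates are rough estimates; the helper name `implied_tokens_per_second` is hypothetical.

```python
# Assumption: "65k" is read as 2**16 tokens; the post does not give the exact figure.
CONTEXT_TOKENS = 65_536

# Durations quoted in the text for a fixed 65k-token context.
durations_s = {"12kbps": 54.4, "18kbps": 36.8, "22kbps": 29.6}


def implied_tokens_per_second(duration_s, context_tokens=CONTEXT_TOKENS):
    """Tokens the codec must emit per second of audio to fill the context."""
    return context_tokens / duration_s


implied_rates = {name: implied_tokens_per_second(d)
                 for name, d in durations_s.items()}

for name, rate in implied_rates.items():
    print(f"{name}: roughly {rate:.0f} tokens per second of audio")
```

With the context length held fixed, a higher bitrate spends more tokens on each second of audio, so the same 65k-token window covers less music.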

12kbps 18kbps 22kbps

You can listen to more raw audio samples here 🎵.


To end on a high note (🙃), we invite you to enjoy Charlie Chen’s creation - a music box that plays Perceiver AR outputs, adding an immensely nostalgic feel to the generated music!

  title={General-purpose, long-context autoregressive modeling with Perceiver AR},
  author={Hawthorne, Curtis and Jaegle, Andrew and Cangea, C{\u{a}}t{\u{a}}lina and Borgeaud, Sebastian and Nash, Charlie and Malinowski, Mateusz and Dieleman, Sander and Vinyals, Oriol and Botvinick, Matthew and Simon, Ian and others},
  booktitle={The Thirty-ninth International Conference on Machine Learning},
176 days ago

Coinbase Leads Users Astray By Recommending Everything Besides Bitcoin


Coinbase capitalizes on the altcoin craze to profit off users. Their “Top 10 Picks” omits bitcoin, and everything else on the list has performed poorly.

The below is a direct excerpt of Marty's Bent Issue #1212: “Save a friend, tell them to get out of the Coinbase casino.” Sign up for the newsletter here.


You'll often hear “Bitcoin maximalists” derided for being anti-free market when cautioning newcomers to stay away from altcoins and the exchanges that push them. Those snake oil salesmen who hiss at Bitcoiners often say that they are simply afraid of competition and don't want to admit that “Bitcoin has stagnated” and “the devs have gone elsewhere.” In reality, many Bitcoiners warn newcomers to stay away from shitcoins and the casinos that list them for trading because they have seen hordes of people led to slaughter by the siren calls of opportunists who care not about human freedom, sound money or decentralization, but about being able to make as much money as possible, no matter how unethically it is acquired.

I highly recommend that you freaks, especially any of you who have fallen prey to the siren calls of “a better Bitcoin,” read through this thread from Sam Callahan, which dives into the overtly predatory tactics of Coinbase and their penchant for listing pre-mined altcoins that are utter trash and get auto-dumped on an unsuspecting retail market. Not only that, but Coinbase tends to hide bitcoin deep in the app so their customers overlook it or simply never find it. They are much more incentivized to siphon off fees from shitcoin trading than to actually educate individuals about bitcoin and help them acquire as much as possible.

I would call it a shame, but it's really worse than that. It's quite disgusting actually and Coinbase and its backers should be utterly ashamed of themselves for engaging in this type of bucket shop activity. A once somewhat respectable brand has completely turned itself into a contemptible bad actor that should be avoided at all costs.

Save yourself, your family and friends. Get your bitcoin off Coinbase and advise your network to do the same.

201 days ago
That's not what a company I own shares in to do, that's dumb and gross.

How to break up with Spotify | Violet Blue on Patreon


Here's a handy privacy-forward guide to ditching Spotify after the company formalized its commitment to Team Pandemic.

Spotify, the music streaming service you’ve called home, has doubled down on enriching and amplifying Joe Rogan. At the same time, the number of Americans dead from Covid-19 is equivalent to the entire population of my hometown, San Francisco (878,000+).

Here’s how to quit Spotify.

First, shop for your new home. There are many to choose from that will accommodate streaming, new releases, discovery, and playlists, including: Tidal, iTunes, Amazon Music, Google Play Music, deezer, Soundcloud, qobuz, Napster, Pandora, Yandex, kkbox, last.fm, and more.

You’ll want to pick a service that meets your needs for convenience, meaning that it’s easy to use on the devices you use for music, like your phone, tablet, computer, or a Sonos system, xbox or PlayStation, etc. You’ll also want to pick a service that has a good security reputation and has a privacy policy that doesn’t look like it was copy/pasted from Facebook.

You may end up choosing “the devil you know.” Meaning, you may already have an account at Amazon or Apple so you know they already have all your scarily-private deets, but they also have a massive security budget, and their apps are on everything you use. You may be sick of Big Tech corporations ripping off artists like crazy, and want to go with an alt music service — if so, be sure to take five minutes to Google (app name + hacked, privacy, or breach) to make sure your choice isn’t sketchy.

I went with Tidal, but FYI: I don’t know anyone there and am not affiliated in any way. I simply saw friends moving there and liked what I saw when I investigated and started signing up. I like that their highest tier gives more money to artists, and that they offer discounts for students, veterans, and — especially important with this decision in our pandemic times — they give 40% off to first responders. Suck on that, Joe Spotify.

Pick your poison and sign up. Do not sign in with Facebook, Apple, Amazon, or any company you don’t want spying on your music streaming data (this data includes everything, from your location to what devices you use, can include voice input, streaming history, podcast interactivity, search queries, saved/favorited items, followers and who you follow, payment information, and more).

Go directly into Settings and rifle through Privacy, Security, and/or anything that’s public-facing, like whether your real name and gender are shown, or whether people can see that you play that one song over and over when you’re sad. Make sure all ad tracking or third-party data sharing is off, no debate. Adjust all the settings.

Next, go into your Spotify settings and click Apps (Apps with access to your Spotify information). Click “Remove Access” to everything. This means you’re signed out on everything.

Check your new service for import functions — you didn’t need to do this before picking a service (I’ll explain that next) but if there’s a playlist/music import function, process or app your new service prefers, then use that.

Otherwise, use what I did, which is the poorly-named service TunemyMusic. Their specialty is doing exactly what we’re doing: moving — or rather, cloning — our music and playlists from one service to another. You don’t need to sign up. Remember, signing up (even if you get “free” access) means you hand over way more than you can control, from app use habits and personal info to location and whatever else they can grab.

Click “Let’s start.” It’ll say “Select the source” and Spotify is first, quite convenient.

Then it will ask you to log in to Spotify and show you a big list of things that it gets permissions to do. Many of those mean that it can reach in and copy over your music. Some are things we don’t want it to ever do, which is why immediately after the transfer we’re going into spotify/Apps to “Remove Access” — even though we’re deleting Spotify. Trust me on this.

Then you see all the stuff you can decide to transfer over or not. 

Update: when I published this post the free/no-signup version of TunemyMusic limited you to 1,000 songs for each import; as of Jan. 30 they cut it in half to 500 songs. I'm guessing a lot of people are using it right now! No problem though, you can just import your playlists in batches (like I did) -- TunemyMusic is a well-made, robust tool, and it's fast. Pick your first batch, do the import, then go back to TunemyMusic’s “Let’s start” for the next group of your playlists.

When you’re done — or if you need to take a break in between imports — go into Spotify’s “Remove Access” and kick out TunemyMusic.

Now make sure your new streaming home works on everything you use for music. It’ll probably work better than Spotify’s crappy app. (FYI if you get stuck with Tidal on gaming consoles or things like Nvidia Shield, you can run it through Plex.)

Once you’re comfy and have made sure your new music service works, it’s time to tell Spotify what you think about its role in directly prolonging and worsening the pandemic that has ruined our lives. Tell them how covid misinformation and fake, dangerous ‘treatments’ (what Joe Rogan and Spotify push) have affected your family, friends, and your future.

Cancel the membership and close the account entirely. This means:

- Go to Account > Plans, and cancel your current plan. It will ask you if you're sure four times, taking you to a new screen each time, just keep clicking "yes I'm sure/cancel."

- Now you need to close the account so you don't remain counted as a user and you'll have formally told the company to stop using, storing, and profiting off your data. Spotify makes you contact a customer service bot on this support page. Click "send message" to start the chat.

The bot will ask you over and over if you're sure, then ask why. It'll ask if you're sure again, then send an email link for you to finalize the closure. Spotify has a 7-day grace period for account re-activation, because greed is clingy.

Delete all instances of the app on every device you own, and on desktop go into your browser settings and delete all cookies and site data. We’re going for scorched earth. (Do the same for TunemyMusic.)

That’s it. Let me know if I missed anything or if you have questions or need advice about privacy and security with any of this. Making a dent in Spotify/Rogan’s conscience is probably out of reach since human suffering is just thought experiments and profit to them, but what’s important here is being able to live with our own choices. We’re the ones who have to “learn to live with it” — namely, all the suffering, long covid, loss of family and friends, and death.


Edit: updated to add Spotify support bot steps.

313 days ago
1 public comment
313 days ago
I personally have been using Apple Music since Rdio shut down, owing to the terrible recommendations engine Spotify had when I was shopping around then. I’ve used iTunes Match since it launched to hold everything in my collection, which is a great way to upgrade any MP3s you still have and it handles any music you have, not just the smaller set licensed to streaming services or even the major stores.
Washington, DC


328 days ago
If you like this stuff there's a pile of it at reddit over at r/hermancainawards

Saturday Morning Breakfast Cereal - Sometimes


Click here to go see the bonus panel!

You know what, I'm going to go make cookies.

Today's News:
461 days ago
1 public comment
461 days ago
But if you make the cookies a more sometimes food you increase your chances of your life being a slightly less sometimes life.

WTF: Signal Adds Cryptocurrency Support


According to Wired, Signal is adding support for the cryptocurrency MobileCoin, “a form of digital cash designed to work efficiently on mobile devices while protecting users’ privacy and even their anonymity.”

Moxie Marlinspike, the creator of Signal and CEO of the nonprofit that runs it, describes the new payments feature as an attempt to extend Signal’s privacy protections to payments with the same seamless experience that Signal has offered for encrypted conversations. “There’s a palpable difference in the feeling of what it’s like to communicate over Signal, knowing you’re not being watched or listened to, versus other communication platforms,” Marlinspike told WIRED in an interview. “I would like to get to a world where not only can you feel that when you talk to your therapist over Signal, but also when you pay your therapist for the session over Signal.”

I think this is an incredibly bad idea. It’s not just the bloating of what was a clean secure communications app. It’s not just that blockchain is just plain stupid. It’s not even that Signal is choosing to tie itself to a specific blockchain currency. It’s that adding a cryptocurrency to an end-to-end encrypted app muddies the morality of the product, and invites all sorts of government investigative and regulatory meddling: by the IRS, the SEC, FinCEN, and probably the FBI.

And I see no good reason to do this. Secure communications and secure transactions can be separate apps, even separate apps from the same organization. End-to-end encryption is already at risk. Signal is the best app we have out there. Combining it with a cryptocurrency means that the whole system dies if any part dies.

604 days ago
This is stupid and it will die quickly. There's far too much friction on so many levels. Again, a mature piece of software starts adding things nobody is asking for. Hopefully whoever green-lit this will feel bad and any more stupid ideas will be slow in coming.
608 days ago
1 public comment
611 days ago
"adding a cryptocurrency to an end-to-end encrypted app muddies the morality of the product, and invites all sorts of government investigative and regulatory meddling: by the IRS, the SEC, FinCEN, and probably the FBI."

"Signal is the best app we have out there. Combining it with a cryptocurrency means that the whole system dies if any part dies"
Earth, Sol system, Western spiral arm