I see no reason to doubt their claims about inactivating viruses on the mask. However, at ~$8 per mask, it would be cheaper to just use one normal N95 per day than to use one of these for 3 days. I expect the antiviral masks will also lose filter efficacy and fit quality with reuse. Also, I don’t think self-inoculation is very likely if you’re careful about handling the mask and wash your hands afterward. So, it’s probably safer overall to use one N95 a day than to reuse an antiviral mask for 3 days.
You’re right. I’ll fix that.
People rarely talk, laugh or scream on public transport, so the risk is much lower compared to somewhere like a bar or hospital. Also, I’m talking about relative contamination levels. Even if you’re only lightly exposed for 20 minutes, the concentration of virus on your mask is probably ~hundreds of times higher than the concentration on your clothes.
Consider the volume of air you breathe in 20 min. 95% of the virus in that air is now on your mask. Compare that to the volume of virus that settles out of the air onto your clothes. Considering COVID can remain in air for hours, that amount is likely much smaller.
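To make this concrete, here's a back-of-envelope version of the comparison. Every constant here is my own assumed ballpark value (breathing rate, aerosol settling velocity, surface areas), not a figure from any study:

```python
# Back-of-envelope: per-area viral contamination, mask vs. clothes.
# All constants are assumed ballpark values for illustration only.

minute_ventilation = 8.0e-3   # m^3 of air breathed per minute (~8 L/min at rest)
exposure_minutes = 20.0
mask_capture = 0.95           # fraction of inhaled virus an N95 traps
mask_area = 0.015             # m^2 of filter surface
settling_velocity = 1e-4      # m/s, rough value for a ~2 micron aerosol
clothes_area = 0.5            # m^2 of forward/upward-facing clothing

# Air "processed" by each surface, in m^3. For a uniform airborne viral
# concentration, virus collected is proportional to this volume.
air_through_mask = minute_ventilation * exposure_minutes * mask_capture
air_onto_clothes = settling_velocity * clothes_area * exposure_minutes * 60

# Contamination per unit area (m^3 of air swept per m^2 of surface).
mask_density = air_through_mask / mask_area
clothes_density = air_onto_clothes / clothes_area

print(f"mask/clothes per-area contamination ratio: {mask_density / clothes_density:.0f}x")
```

Under these guesses the ratio comes out on the order of 100x, consistent with the "~hundreds of times" estimate above; the dominant factor is that a mask concentrates a large volume of actively pulled air onto a very small filter area.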
I address that in the general comments section. Exhalation valves do make N95s worse at source control compared with N95s without a valve. However, an N95 with a valve is still about as good at protecting the people around you as a cloth mask or surgical mask.
Thank you for the question. I’ll add my response to my answer.
Keep in mind that the filter surfaces (where the air flows through) of the mask may have spent hours collecting COVID particles from the atmosphere, since you’ve been continuously pulling contaminated air through the mask. The filter surfaces may have thousands of times the level of contamination typically seen on solid surfaces. It’s best to handle potentially contaminated masks by the straps, especially when removing the mask. If you absolutely must touch the mask itself, avoid touching the filter surfaces, and instead touch a portion of the mask’s edge that’s away from your mouth and eyes. However, virus can still potentially transfer to your hands, even with proper handling (source). Thus, you should wash your hands after handling a potentially contaminated mask.
Reaerosolization of filtered particles is possible, but seems to only occur to a significant degree when the humidity is low and the particles in question are large and dry (source, source). Virus particles are typically either small and dry or large and wet (when suspended in water droplets), so I don’t think this is the primary concern. I guess avoid inhaling too close to the mask’s outer surface if you’re worried about this.
Thank you for this excellent post. Here are some thoughts I had while reading.
I think there’s another side to the hard paths hypothesis. We are clearly the first technology-using species to evolve on Earth. However, it’s entirely possible that we’re not the first species with human-level intelligence. If a species with human-level intelligence but no opposable thumbs evolved millions of years ago, it could have died out without leaving any artifacts we’d recognize as signs of intelligence.
Besides our intelligence, humans seem odd in many ways that could plausibly contribute to developing a technological civilization.
We are pretty long-lived.
We are fairly social.
Feral children raised outside of human culture experience serious and often permanent mental disabilities (Wikipedia).
A species with human-level intelligence, but whose members live mostly independently, may not develop a technological civilization.
We have very long childhoods.
We have ridiculously high manual dexterity (even compared to other primates).
We live on land.
Most animals are aquatic.
It’s hard to have an industrial revolution when you can’t burn things.
Note that by Wikipedia’s listed estimates for cortical neuron counts, there are multiple dolphin/whale species with higher counts than ours.
Given how well-tuned our biology seems for developing civilization, I think it’s plausible that multiple human-level intelligent species arose in Earth’s history, but additional bottlenecks prevented them from developing technological civilization. However, most of these bottlenecks wouldn’t be an issue for an intelligence generated by simulated evolution. E.g., we could intervene in such a simulation to give low-dexterity species other means of manipulating their environment. Perhaps Earth’s evolutionary history actually contains n human-level intelligent species, only one of which developed technology. If so, the true compute required to evolve human-level intelligence is roughly n times lower than an estimate that assumes intelligence evolved only once.
I also think the discussion of neuromorphic AI and whole brain emulation misses an important possibility that Gwern calls “brain imitation learning”. In essence, you record a bunch of data about human brain activity (using EEG, implanted electrodes, etc.), then you train a deep neural network to model the recorded data (similar to how GPT-3 or BERT model text). The idea is that modeling brain activity will cause the deep network to learn some of the brain’s neurological algorithms. Then, you train the deep network on some downstream task and hope its learned brain algorithms generalize to the task in question.
I think brain imitation learning is pretty likely to work. We’ve repeatedly seen in deep learning that knowledge distillation (training a smaller student model to imitate a larger teacher model) is FAR more computationally efficient than trying to train the student model from scratch, while also giving superior performance (Wikipedia, distilling BERT, distilling CLIP). Admittedly, brain activity data is pretty expensive. However, the project that finally builds human-level AI will plausibly cost billions of dollars in compute for training. If brain imitation learning can cut the price by even 10%, it will be worth hundreds of millions in terms of saved compute costs.
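For concreteness, the core of knowledge distillation is just training the student to match the teacher's softened output distribution. Here's a minimal NumPy sketch of the standard temperature-scaled distillation loss (the function and variable names are mine, for illustration):

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the softened teacher and student distributions.

    The student matches the teacher's full output distribution, which
    carries far more signal per example than a one-hot label -- this is
    why distillation is so much more sample-efficient than training
    from scratch.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_log_probs = np.log(softmax(student_logits, temperature))
    return -(teacher_probs * student_log_probs).sum(axis=-1).mean()

# A student whose logits already resemble the teacher's incurs a lower
# loss than one whose logits don't.
teacher = np.array([[3.0, 1.0, -2.0]])
close_student = np.array([[2.5, 0.8, -1.5]])
far_student = np.array([[-2.0, 0.5, 3.0]])
print(distillation_loss(close_student, teacher))
print(distillation_loss(far_student, teacher))
```

Brain imitation learning amounts to the same move with recorded brain activity standing in for the teacher's logits, so the sample-efficiency argument plausibly carries over.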
Need: iOS app for continuously recording iPhone sensor data.
Other programs I’ve tried: Toolbox—Smart Meter Tools, Sensors Toolbox—Multitool, phyphox, Physics Toolbox Sensor Suite, Gauges
I’ve tried many apps that let you see sensor data from your iPhone, but SensorLog is the first that lets you log gigabytes of data in the background continuously for multiple days. Ironically, it’s also one of the smallest apps I’ve used, at just 2.2 MB. My only issue with it is that the average audio dB logs seem to be bugged for long-term recordings.
Broadly speaking, there are five issues to worry about when reusing masks:
Virus particles contaminate the mask’s surface, and may spread to you while handling the mask.
Mask filter surfaces (where the air flows through) may have spent hours collecting COVID particles from the air, since you’ve been continuously pulling contaminated air through the mask. The filter surfaces may have hundreds or thousands of times the contamination seen on solid surfaces.
Reaerosolization of filtered particles (where particles trapped by the mask re-enter the air) is possible, but likely releases negligible amounts of virus (source, source) compared to the ~5% an N95 mask fails to stop.
There are a number of approaches for decontaminating masks. For coronavirus specifically, the simplest approach is to just let the masks sit. The time necessary to inactivate COVID virions depends on the temperature and humidity. Options include: (source)
4 days at 21-23 °C, 40% humidity (However, this source indicates virions may still be present after 6 days)
1 hour at 70 °C, at any humidity (using, e.g., an oven)
Boiling water for 5 min (may lose ~8% filtration efficacy, but also cleans mask of dirt)
UV-C radiation can also decontaminate masks. However, this process is potentially unreliable because the UV intensity needed varies with mask material, masks with unusual geometry may shadow portions of the mask from treatment, and dirt or other soilage may block radiation (source). Make sure to use >=1 J/cm^2 for >=1 minute (source). Don’t exceed 10 J/cm^2, to avoid damaging the mask’s structure.
Chemical agents such as ethanol and bleach may reduce mask filtration (source).
Loss of mask structure prevents a good fit to your face.
Generally, it’s hard to properly fit an N95. Among 74 anesthesiologists, 63% of women and 29% of men failed fit testing, even with a fresh respirator (source). Overall failure rates were 43% after 4 days of reuse, 50% after 10 days, and 55% after 15 days. Additionally, people were very bad at estimating the quality of their fits, with 73% of those who failed the test thinking they had a good fit.
Loss of electrostatic charge worsens filtration efficacy.
N95 masks don’t lose much efficacy if they’re just stored, even for years at a time (source). However, they do eventually lose efficacy if they’re actually used. With 8 hours per day of use, N95 masks retain ~95% efficacy after 3 days, ~92% efficacy after 5 days, and drop to ~80% efficacy after 14 days (source). Note: this refers to just the material’s filtration efficacy, and does not take into account any further reduction due to worsened fit quality.
Mask electrostatic charge degrades more quickly in humid environments (source). Thus, a mask with an exhalation valve, which vents humid exhaled breath instead of forcing it through the filter, will likely last longer. An N95 respirator with an exhalation valve is likely as effective at source control (preventing spread from you to others) as a cloth or surgical mask (source), but many establishments (such as airlines) do not allow masks with exhalation valves.
If you want to get fancy, this paper describes a procedure for recharging a mask’s electrostatic potential. However, that won’t help with the loss of structure issue.
Accumulation of filtered particulate makes the mask harder to breathe through and makes inhaled air more likely to pass around the mask rather than through it.
I don’t think this is usually an issue because loss of structure/efficacy will force you to change masks more quickly than the masks get clogged. However, if you’re in a dusty/smoky location, it could be a problem. I suggest changing out a mask as soon as you notice it getting more difficult to breathe. You can also wear a surgical mask over the N95 to protect it from larger contaminants.
Accumulation of sweat/dirt/etc makes the mask disgusting to wear.
I suppose this is up to personal preference.
I’d suggest replacing an N95 mask at least once every 5 days, and preferably once every 3 days. I’d suggest 1 hour at 70 °C for decontamination. Additionally:
I’d recommend using a mask with an exhalation valve, if you can.
I’d recommend storing masks in a low-humidity environment while not using them.
It’s best to handle potentially contaminated masks by the straps, especially during removal. If you absolutely must touch the mask itself, avoid touching filter surfaces or interior, and instead touch the edge of the mask somewhere that’s away from your mouth and eyes.
Virus can still transfer to your hands, even with proper handling (source). You should wash your hands after handling a potentially contaminated mask.
You should never stack potentially contaminated masks. I.e., don’t allow the filter surface of one mask to be in contact with the interior of another.
Wearing a cloth/surgical mask over the N95 will help protect it from splashes and large contaminants. However, this may accelerate the loss of electrostatic charge by increasing the humidity within the mask. Do this if you think your N95 may get spoiled otherwise.
If decontamination is a chore, one option would be to use a rotating set of masks, wearing one a day sequentially until you’ve worn them all once, then decontaminating the entire set using an oven.
Additionally, you may want to consider alternatives to N95s. Half-face elastomeric respirators are designed to be reusable, are far more protective than even a properly fitted N95, are much easier to fit properly, and are (in my experience) much more comfortable than expected. They also only require replacement filters when breathing becomes difficult, so they cost less in the long run.
Finally, at the highest tier of protection, you can buy powered air purifying respirators for $300 or make your own for $15-30. I don’t have any experience with either option, so I can’t comment much.
There should be a fair bit more than 2 epsilon of leeway in the line of equality. Since the submodules themselves are learned by SGD, they won’t be exactly equal. Most likely, the model will include dropout as well. Thus, the signals sent to the combining function will almost always differ by far more than the limits of numerical precision. This means the combining function will need quite a bit of leeway; otherwise, the network’s performance will always be zero.
I think this is plausible, but maybe a bit misleading in terms of real-world implications for AGI power/importance.
Looking at the scaling laws observed for language model pretraining, we see strongly sublinear increases in pretraining performance for linear increases in model size. In figure 3.8 of the GPT-3 paper, we also see that zero/few/many-shot transfer performance on the SuperGLUE benchmark scales sublinearly with model size.
However, the economic usefulness of a system depends on a lot more than just parameter count. Consider that Gorillas have 56% as many cortical neurons as humans (9.1 vs 16.3 billion; see this list), but a human is much more than twice as economically useful as a gorilla. Similarly, a merely human level AGI that was completely dedicated to accomplishing a given goal would likely be far more effective than a human. E.g., see the appendix of this Gwern post (under “On the absence of true fanatics”) for an example of how 100 perfectly dedicated (but otherwise ordinary) fanatics could likely destroy Goldman Sachs, if each were fully willing to dedicate years of hard work and sacrifice their lives to do so.
If gradient hacking is thought to be possible because gradient descent is a highly local optimization process, maybe it would help to use higher-order approaches. E.g., Newton’s method uses second order derivative information, and the Householder methods use even higher order derivatives.
These higher-order methods aren’t commonly used in deep learning because of their additional computational expense. However, if such methods can detect and remove mechanisms of gradient hacking that are invisible to gradient descent, it may be worthwhile to occasionally use higher-order methods in training.
I think it’s plausible we’ll be able to use deep learning to model a brain well before we understand how the brain works.
Record a ton of brain activity + human behaviour with a brain-computer interface and wearable recording devices, respectively.
Train a model to predict future brain activity + behaviour, conditioned on past brain activity + behaviour.
Continue running the model by feeding it its own predicted brain activity + behaviour as the conditioning data for future predictions.
Congratulations, you now have an emulated human. No need to understand any brain algorithms. You just need tons of brain + behaviour data and compute.
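The steps above can be sketched as an autoregressive rollout loop, exactly as in language-model sampling. Everything here is schematic: `model` is a stand-in for the trained predictor, and the state vectors stand in for recorded brain activity + behaviour features.

```python
import numpy as np

STATE_DIM = 8  # placeholder size for the (brain activity + behaviour) feature vector

def model(history):
    """Stand-in for a trained predictor of the next (activity, behaviour)
    state given the history so far. Here it's just a fixed random linear
    map with a tanh nonlinearity, so the demo runs."""
    rng = np.random.default_rng(0)  # fixed seed -> same "weights" every call
    W = rng.standard_normal((STATE_DIM, STATE_DIM)) * 0.1
    return np.tanh(W @ history[-1])

def emulate(seed_history, steps):
    """Step 3: run the model on its own predictions. Each predicted state
    is appended to the history and conditioned on for the next step."""
    history = list(seed_history)
    for _ in range(steps):
        history.append(model(history))
    return history

# Seed with one recorded state (step 1's data, stubbed out here), then roll out.
recorded = [np.full(STATE_DIM, 0.5)]
trajectory = emulate(recorded, steps=10)
print(len(trajectory))  # 11 states: 1 recorded + 10 predicted
```

The point of the sketch is that the emulation loop itself is trivial; all of the difficulty is hidden inside `model`, i.e., in steps 1 and 2.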
I think this will be possible before non brain-based AGI because current AI research indicates it’s easier to train a model by distilling/imitating an already trained model than it is to train from scratch, e.g., DistilBERT: https://arxiv.org/abs/1910.01108v4
Do you view art, literature, meditation or pet care similarly?
I thought that if things got significantly more intense I might have a heart attack and die!
I was initially skeptical that this was a risk worth considering. I’ve heard anecdotes of people dying of excitement, but it seemed like a “shark attack” sort of risk that’s more discussed than experienced. However, some Googling revealed “Cardiovascular Events during World Cup Soccer”, which finds that rates of cardiac incidents were 2.66x higher on days the German team competed during the 2006 soccer World Cup. FIFA’s website says an average of ~21.9 million people watched each match. This website says Germany had a population of 81,472,235 in 2006.
If we attribute 100% of the 2.66x increase to 21.9 million soccer fans being more excited on those days (as opposed to getting less sleep, drinking more alcohol, etc.), then we get (CV_risk_x * 21.9 + 59.57) / 81.47 = 2.66, so CV_risk_x = 7.18x higher risk due to extreme excitement. If we arbitrarily attribute 33% of the increase to excitement, we get (CV_risk_x * 21.9 + 59.57) / 81.47 = 1.548, and CV_risk_x = 3.04x.
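Spelling out the algebra (the population and viewer figures are the ones cited above; the attribution fractions are the arbitrary choices described):

```python
# Solve (risk_x * fans + non_fans) / population = observed_increase
# for risk_x, the relative cardiovascular risk among excited fans,
# assuming non-fans stay at baseline (1x) risk.

population = 81.47            # millions, Germany in 2006
fans = 21.9                   # millions watching each match
non_fans = population - fans  # 59.57 million at baseline risk

def fan_risk(observed_increase):
    return (observed_increase * population - non_fans) / fans

# Attribute 100% of the 2.66x increase to excitement:
print(round(fan_risk(2.66), 2))   # ~7.18x

# Attribute 33% of the 1.66x excess to excitement: 1 + 0.33 * 1.66 = 1.548
print(round(fan_risk(1.548), 2))  # ~3.04x
```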
That’s higher than I expected, but still not too bad, especially if your current risk is low. I think virtual reality in particular is less of a risk than many other high-excitement activities because it involves more exertion than, say, normal video games or reading. I expect the increased exertion on net more than balances out any excitement risks.
Your link says rats have ~200 million neurons, but I think synapses are a better comparison for NN parameters. After all, both synapses and parameters roughly store how strong the connections between different neurons are.
Using synapse count, these agents are closer to guppies than to rats.
The summary says they use text and a search for “text” in the paper gives this on page 32:
“In these past works, the goal usually consists of the position of the agent or a target observation to reach, however some previous work uses text goals (Colas et al., 2020) for the agent similarly to this work.”
So I thought they provided goals as text. I’ll be disappointed if they don’t. Hopefully, future work will do so (and potentially use pretrained LMs to process the goal texts).
There are people who’ve been blind from birth. They’re still generally intelligent. I think general intelligence is mostly applying powerful models to huge amounts of rich data. Human senses are sufficiently rich even without vision.
Also, there are lots of differences between human brains and current neural nets. E.g., brains are WAY more powerful than current NNs and train for years on huge amounts of incredibly rich sensory data.
What really impressed me were the generalized strategies the agent applied to multiple situations/goals. E.g., “randomly move things around until something works” sounds simple, but learning to contextually apply that strategy
to the appropriate objects,
in scenarios where you don’t have a better idea of what to do, and
immediately stopping when you find something that works
is fairly difficult for deep agents to learn. I think of this work as giving the RL agents a toolbox of strategies that can be flexibly applied to different scenarios.
I suspect that finetuning agents trained in XLand in other physical environments will give good results because the XLand agents already know how to use relatively advanced strategies. Learning to apply the XLand strategies to the new physical environments will probably be easier than starting from scratch in the new environment.
Very impressive results! I’m particularly glad to see the agents incorporating text descriptions of their goals in their inputs. It’s a step forward in training agents that flexibly follow human instructions.
However, it currently looks like the agents are just using the text instructions as a source of information about how to acquire reward from their explicit reward functions, so this approach won’t produce corrigible agents. Hopefully, we can combine XLand with something like the cooperative inverse reinforcement learning paradigm.
E.g., we could add a CIRL agent, whose objective is to assist the standard RL agents, to the XLand environments. Then we’d have:
An RL agent
whose inputs are the text description of its goal and its RGB vision + other sensors
that gets direct reward signals
A CIRL agent
whose inputs are the text description of the RL agent’s goals and the CIRL agent’s own RGB vision + other sensors
that has to infer the RL agent’s true reward from the RL agent’s behavior
Then, apply XLand open ended training where each RL agent has a variable number of CIRL agents assigned as assistants. Hopefully, we’ll get a CIRL agent that can receive instructions via text and watch the behavior of the agent it’s assisting to further refine its beliefs about its current objective.
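To make the information flow explicit, here's the proposed setup as bare data structures (all names and the example goal string are mine; this is just a schematic, not an implementation of XLand or CIRL):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class RLAgent:
    """Standard XLand-style agent: sees its own goal text directly and
    receives the environment's explicit reward signal."""
    goal_text: str
    observations: List[str] = field(
        default_factory=lambda: ["rgb_vision", "other_sensors"])
    reward_source: str = "environment"  # direct reward from the task

@dataclass
class CIRLAssistant:
    """Assistant that sees the principal RL agent's goal text but must
    infer the true reward from the principal's behaviour (CIRL-style)."""
    principal: RLAgent
    observations: List[str] = field(
        default_factory=lambda: ["rgb_vision", "other_sensors"])
    reward_source: str = "inferred_from_principal_behaviour"

@dataclass
class Episode:
    """One open-ended training episode: each RL agent is assigned a
    variable number of assistants."""
    rl_agent: RLAgent
    assistants: List[CIRLAssistant]

# Hypothetical example episode with one principal and two assistants.
principal = RLAgent(goal_text="Hold the purple sphere near the yellow cube")
episode = Episode(principal, [CIRLAssistant(principal) for _ in range(2)])
print(len(episode.assistants))
```

The key asymmetry is in `reward_source`: the principal optimizes the environment's reward directly, while the assistants never see it and must infer it, which is what would make them candidates for corrigible behavior.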