Okay, I have been tentatively dipping into machine learning. I’m coming to it from a bit behind the curve because, well… it’s got a learning curve and I have a day job. But I’ve been steadily picking things up. Mostly, I have been noting where there are opportunities to check machine learning’s work: cases where a deeper survey has since come along, so we can check the ML predictions using traditional astronomy techniques we understand much better.
And that is where I have been putting some of my recent research effort. A little while ago I checked the XSAGA catalog, a set of ML predictions for objects at z < 0.03, against the deeper GAMA spectroscopic sample. The overlap is much smaller than the full XSAGA sample, but it gives us a direct measure of how well this ML technique did (well done John Wu, it was spot on where you predicted the effectiveness would be). More about it in this paper in MNRAS (or astroph here).
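For those curious what such a check looks like mechanically, here is a minimal sketch: cross-match the ML candidates against the spectroscopic sample and compute the purity of the low-redshift prediction. The file and column names are hypothetical placeholders, not the actual XSAGA or GAMA schemas.

```python
# Sketch: cross-match an ML low-z candidate catalog against spectroscopic
# redshifts and compute the purity. File/column names are hypothetical.
import numpy as np
import astropy.units as u
from astropy.coordinates import SkyCoord
from astropy.table import Table

xsaga = Table.read("xsaga_candidates.fits")  # ML-predicted z < 0.03 objects
gama = Table.read("gama_specz.fits")         # spectroscopic redshifts

ml = SkyCoord(xsaga["RA"] * u.deg, xsaga["DEC"] * u.deg)
spec = SkyCoord(gama["RA"] * u.deg, gama["DEC"] * u.deg)

# Nearest spectroscopic neighbour for every ML candidate
idx, sep, _ = ml.match_to_catalog_sky(spec)
matched = sep < 1.0 * u.arcsec

# Purity: fraction of matched ML candidates that really are at z < 0.03
z_spec = gama["Z"][idx[matched]]
purity = np.mean(z_spec < 0.03)
print(f"purity = {purity:.2f} over {matched.sum()} matched objects")
```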
So that was fun. But I wanted to talk about a second paper, where I checked an ML prediction against more traditional astronomy. In this case, a Galaxy Zoo comparison.
The Galaxy Zoo Catalogs for the Galaxy And Mass Assembly (GAMA) Survey
[astroph]
This started as a much more modest idea: we have two Galaxy Zoo catalogs on the equatorial GAMA fields, so let’s compare the voting and make these catalogs public so they can be used by students. My main motivation was to generate something that could serve as a reference for students using the GAMA catalogs, and to make the Galaxy Zoo tables easy to use by adding a CATAID column to them.
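Adding that CATAID column is essentially a positional cross-match against the GAMA catalog. A minimal sketch of the idea, with illustrative file and column names (the real catalogs have their own schemas):

```python
# Sketch: attach the GAMA CATAID to a Galaxy Zoo table by positional
# cross-match. Filenames, column names, and the 1" radius are illustrative.
import astropy.units as u
from astropy.coordinates import SkyCoord
from astropy.table import Table

gz = Table.read("galaxyzoo_votes.fits")
gama = Table.read("gama_equatorial.fits")  # has CATAID, RA, DEC

gz_pos = SkyCoord(gz["ra"] * u.deg, gz["dec"] * u.deg)
gama_pos = SkyCoord(gama["RA"] * u.deg, gama["DEC"] * u.deg)

idx, sep, _ = gz_pos.match_to_catalog_sky(gama_pos)
good = sep < 1.0 * u.arcsec

# Add the identifier column; unmatched rows keep a sentinel value
gz["CATAID"] = -1
gz["CATAID"][good] = gama["CATAID"][idx[good]]
gz.write("galaxyzoo_votes_cataid.fits", overwrite=True)
```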
All well and good. I started comparing voting fractions between the two catalogs: the one originally made for the GAMA collaboration, using KiDS imaging and voting from the Galaxy Zoo citizen scientists, and the newer one based on DESI imaging.
But on closer inspection, the Walmsley+ (2023, astroph) catalog only includes predicted voting fractions! Those come from the ZooBot machine learning algorithm, which is trained on early votes and then predicts the voting for the rest of the survey. This is a necessary step, as voting on the full survey would simply take too long. So instead of an A/B test of imaging surveys (KiDS vs DESI), it also became a comparison of Galaxy Zoo voting by people against voting by people+ZooBot!
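Once both tables carry a CATAID, comparing them is a straightforward join. A sketch of that comparison, again with placeholder column names standing in for the actual Galaxy Zoo vote-fraction columns:

```python
# Sketch: compare a voting fraction between the two catalogs after both
# have a CATAID column. Column names are placeholders.
import numpy as np
from astropy.table import Table, join

kids = Table.read("gz_kids_cataid.fits")  # human votes on KiDS imaging
desi = Table.read("gz_desi_cataid.fits")  # human + ZooBot predicted fractions

# Shared column names get suffixed "_kids" / "_desi" by the join
both = join(kids, desi, keys="CATAID", table_names=["kids", "desi"])

f_kids = both["smooth_fraction_kids"]
f_desi = both["smooth_fraction_desi"]
r = np.corrcoef(f_kids, f_desi)[0, 1]
print(f"{len(both)} galaxies in common, corr(f_smooth) = {r:.2f}")
```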
Fortunately, the questions remained the same between the KiDS Galaxy Zoo effort and the DESI Galaxy Zoo (+ZooBot) effort. Well, mostly. The question tree looked like this:
The main difference is in the very first question. The DESI voting (and the ZooBot trained on it) tends to favor smooth galaxies over ones with features. Why would that be?
The difference is depth: the DESI imaging is much shallower than KiDS, by design, to cover more of the sky. But that means faint features surrounding a bulge, e.g. a disk with spiral arms, will not be as readily visible.
But once DESI Galaxy Zoo (+ZooBot) can detect a galaxy with features, the votes agree with each other! How tightly wound are the spiral arms?
Or how many are there?
Sure, there is some variance. That was to be expected. But as long as the volunteers (and ZooBot!) identify features, the answers to the follow-up questions agree well enough.
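In practice that means restricting the comparison to galaxies that both efforts confidently call featured before looking at the follow-up questions. Continuing from the joined table in the sketch above (thresholds and column names are again illustrative):

```python
import numpy as np

# Only compare the follow-up answers where both catalogs see features;
# the 0.5 cut is an illustrative threshold, not the paper's actual choice.
featured = (both["features_fraction_kids"] > 0.5) & \
           (both["features_fraction_desi"] > 0.5)

for n in (1, 2, 3, 4):
    col = f"spiral_arm_count_{n}_fraction"
    r = np.corrcoef(both[col + "_kids"][featured],
                    both[col + "_desi"][featured])[0, 1]
    print(f"{n} arm(s): corr = {r:.2f}")
```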
That brought me to the next question: can you use the voting predictions from ZooBot on the shallower DESI data to get a similar science result? This was of interest to me since I have had a few undergraduate students work on the KiDS Galaxy Zoo catalog, with some interesting results:
The Loneliest Galaxies in the Universe: A GAMA and GalaxyZoo Study on Void Galaxy Morphology
Lori Porter [astroph]
Galaxy And Mass Assembly: galaxy morphology in the green valley, prominent rings, and looser spiral arms
Dominic Smith [astroph]
Galaxy And Mass Assembly: Galaxy Zoo spiral arms and star formation rates
Ren Porter-Temple [astroph]
And that last one seemed like a good cross-check. Can we reproduce Ren’s result using the DESI Galaxy Zoo voting fractions?
And yes, the results are qualitatively the same. The statistics are a little worse, because DESI does not have as many votes on the number of spiral arms as the deeper KiDS (thanks to that first-question difference). But there is a pretty clear rise in stellar mass with the number of spiral arms. And the main conclusion is also recovered: the specific star-formation rate drops slightly with the number of arms.
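The shape of that check is simple enough to sketch: assign each featured galaxy its most-voted arm count, then look at the median stellar mass and specific SFR in each bin. File and column names below are hypothetical stand-ins for the DESI vote fractions joined to GAMA stellar masses and SFRs:

```python
# Sketch: a Porter-Temple style check with DESI vote fractions.
# All file/column names and the 0.5 threshold are illustrative.
import numpy as np
from astropy.table import Table

cat = Table.read("gz_desi_with_gama.fits")  # votes + GAMA masses/SFRs

# keep galaxies confidently voted as featured
cat = cat[cat["features_fraction"] > 0.5]

arm_counts = np.array([1, 2, 3, 4])
fracs = np.vstack([cat[f"spiral_arm_count_{n}_fraction"] for n in arm_counts])
best = arm_counts[np.argmax(fracs, axis=0)]  # most-voted arm count per galaxy

for n in arm_counts:
    sel = best == n
    print(f"{n} arms: N={sel.sum():5d}  "
          f"median logM* = {np.median(cat['logmstar'][sel]):.2f}  "
          f"median log sSFR = {np.median(cat['log_ssfr'][sel]):.2f}")
```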
For the GAMA fields this may not be that relevant, since the statistics in Porter-Temple+ (2022) were much better. BUT it means you can redo the experiment with DESI at scale.
Checking ML’s work remains critical, in my opinion. You can’t just shrug and accept black-box results. Where there are opportunities to cross-check against a different data-set, that is perhaps unexciting but critical science.
Final conclusion: ZooBot works!