High-dimensional econometrics

I feel like you got no serious answer so far.
100% agree with this statement!
Look, stop looking for "high-dimensional econometrics". Just learn econometrics; any good econometrics textbook or course mostly deals with many variables. Also take a multivariable methods/analysis class. Both will teach you most of what you need to know.

I know econometrics. There are some important differences in high dimensions, from what I can tell. Some of Belloni and Chernozhukov's work, for example, applies in the case where you have as many instruments as observations. They have other work on selecting instruments in a data-driven way using the lasso. Neither of these is covered in the econometrics field courses I took.
Separately, and also by Chernozhukov and coauthors, there is the double machine learning literature. There is also the work on causal forests associated with Athey and Wager.
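The lasso-for-instruments idea mentioned above can be sketched in a few lines. This is my own toy simulation, not Belloni et al.'s exact estimator (which uses a particular plug-in penalty rather than cross-validation): with many candidate instruments, select the first stage with the lasso, then run ordinary 2SLS on the selected instruments.

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(2)
n, p = 500, 100
Z = rng.normal(size=(n, p))                 # many candidate instruments
u = rng.normal(size=n)                      # shared error -> d is endogenous
d = Z[:, 0] + 0.5 * Z[:, 1] + u + rng.normal(size=n)  # only Z0, Z1 are relevant
y = 1.0 * d + u + rng.normal(size=n)        # true effect = 1; OLS is biased up

# first stage: lasso (penalty chosen by CV here) picks the relevant instruments
sel = np.flatnonzero(LassoCV(cv=5).fit(Z, d).coef_)
Zs = Z[:, sel]

# second stage: ordinary 2SLS using only the selected instruments
dhat = Zs @ np.linalg.lstsq(Zs, d, rcond=None)[0]
beta_iv = (dhat @ y) / (dhat @ d)
print(beta_iv)  # roughly 1, versus an upward-biased OLS estimate
```

CV-based lasso tends to over-select, which is why the actual papers use a theory-driven penalty; the sketch still conveys the selection-then-2SLS structure.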
To answer my own question, this is the kind of thing I was looking for:
https://sites.google.com/site/jeremylhour/courses
With notes here:
https://drive.google.com/file/d/1L_iervUBKj3RsXHLEGOtAFlyHEHpmyT4/view

Wow, these are excellent lecture notes.

I was precisely going to recommend the double ML literature and Athey's papers. I know that Chernozhukov is currently teaching a course on high-dimensional metrics at MIT (called something like "Inference using ML"). Try to see if you know someone who could send you their notes.
Otherwise, Goldsmith-Pinkham has a summary of some of these topics in his notes on applied methods, and they are public on YouTube and GitHub. I hope it helps :)
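For what the double ML recommendation amounts to mechanically, here is a minimal sketch of the partialling-out version on simulated data (my illustration, not the exact estimator from the papers): fit the two nuisance regressions E[y|x] and E[d|x] with a flexible learner using cross-fitting, then get the treatment coefficient from a residual-on-residual regression.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 5))
g = np.sin(x[:, 0]) + x[:, 1] ** 2           # nonlinear confounder
d = g + rng.normal(size=n)                   # treatment depends on x through g
theta = 0.5
y = theta * d + g + rng.normal(size=n)

# cross-fitted nuisance predictions: each observation is predicted out-of-fold
rf = lambda: RandomForestRegressor(n_estimators=200, random_state=0)
yhat = cross_val_predict(rf(), x, y, cv=5)
dhat = cross_val_predict(rf(), x, d, cv=5)

# residual-on-residual OLS: a Neyman-orthogonal estimate of theta
ry, rd = y - yhat, d - dhat
theta_hat = (rd @ ry) / (rd @ rd)
print(theta_hat)  # roughly 0.5
```

The cross-fitting is what lets you use an overfitting-prone learner for the nuisances without contaminating the final OLS step.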

If you have many dimensions of unobserved heterogeneity, you can look at Fox et al. (2016) to recover the joint distribution with a simple linear solution. In general I would also take a look at Bonhomme (2021) on discretizing unobserved heterogeneity. The idea is that if you have fixed effects in a complicated problem, you may get around the incidental parameter problem by doing ex-ante clustering and using group effects instead, which are asymptotically consistent.
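A toy version of that cluster-then-group-effects idea (my sketch, not Bonhomme et al.'s actual two-step estimator): with discrete unobserved types, k-means on an informative moment of the data recovers approximate groups, and a regression with a handful of group dummies replaces one dummy per unit.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
n_units, t = 200, 5
alpha = rng.choice([-1.0, 0.0, 1.0], size=n_units)  # discrete unobserved types
x = rng.normal(size=(n_units, t)) + alpha[:, None]  # x is informative about alpha
beta = 2.0
y = beta * x + alpha[:, None] + rng.normal(scale=0.3, size=(n_units, t))

# step 1 (ex-ante clustering): group units on a moment that reveals alpha
groups = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(
    x.mean(axis=1, keepdims=True))

# step 2: estimate 3 group effects instead of 200 unit effects
G = np.eye(3)[np.repeat(groups, t)]                 # group dummies, long format
Z = np.column_stack([x.ravel(), G])
coef = np.linalg.lstsq(Z, y.ravel(), rcond=None)[0]
print(coef[0])  # roughly beta = 2 (small bias from misclassified units)
```

With short panels (t = 5 here) the unit-dummy version would suffer from the incidental parameter problem; the group version only estimates a fixed number of effects.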

Wouldn't you just apply PCA and then reg y x?
This could bias your betas as well as hurt forecasting performance. A principal component of X is a linear combination of the x's. The weights in this combination are chosen so that the component explains the maximum possible share of the total variance of the x's. Those weights are constructed from the correlation structure of X alone (we are maximizing the explained variance of X), not from the correlation structure of [y, X] (PCA does not care about y at all). Now imagine an extreme case where one of the x's is important for explaining y but is only weakly correlated with the rest of the x's. PCA will assign that x a weight close to zero, since including it in the combination does little to increase the explained share of X's total variance. Effectively, you end up with an omitted-variable regression.
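A small simulation of that extreme case (my sketch, not from the thread): four x's share a common factor, while a fifth is independent of them but is the only driver of y. The first principal component then carries essentially none of y's signal.

```python
# x1..x4 load on one common factor; x5 is independent but is the only driver of y.
# The first principal component ignores x5, so PC regression misses y's signal.
import numpy as np

rng = np.random.default_rng(0)
n = 5000
f = rng.normal(size=n)                        # common factor behind x1..x4
X = np.column_stack([f + 0.1 * rng.normal(size=n) for _ in range(4)]
                    + [rng.normal(size=n)])   # x5: independent of the factor
y = X[:, 4] + 0.1 * rng.normal(size=n)        # y driven only by x5

Xc = X - X.mean(axis=0)
# first principal component = projection on the top right singular vector
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Xc @ Vt[0]

def r2(z, y):
    """R^2 from regressing y on z (with intercept)."""
    Z = np.column_stack([np.ones_like(z), z])
    resid = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return 1 - resid.var() / y.var()

print(r2(pc1, y))      # near 0: PC1 carries almost none of y's signal
print(r2(X[:, 4], y))  # near 1: the "omitted" x5 explains y
```

Regressing y on PC1 looks like a reasonable dimension reduction but is exactly the omitted-variable case described above.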

That basically means you have chosen the wrong X's.
Garbage in, garbage out. 
So, you are saying that PCA is not a tool to heal a badly specified model? Who would've thought?

In macro, most folks toss in all the data that exists (say FRED-MD) and don't use preselection procedures. This works well if the idea is to proxy for the main aggregate shocks driving the economy. But if you're targeting a more granular variable, then yeah, you need to preselect or use something like 3PRF.
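For reference, here is a minimal one-factor sketch of the 3PRF idea (Kelly and Pruitt's three-pass regression filter), with y itself as the proxy. This is my own toy simulation, not their full estimator: unlike PCA, the extracted factor is steered toward the target.

```python
import numpy as np

rng = np.random.default_rng(3)
T, N = 400, 100
f1 = rng.normal(size=T)             # target-relevant factor
f2 = 3 * rng.normal(size=T)         # dominant but irrelevant factor (PCA would pick this)
lam1, lam2 = rng.normal(size=N), rng.normal(size=N)
X = np.outer(f1, lam1) + np.outer(f2, lam2) + rng.normal(size=(T, N))
y = f1 + 0.2 * rng.normal(size=T)

Xc, yc = X - X.mean(0), y - y.mean()
phi = Xc.T @ yc / (yc @ yc)         # pass 1: time-series loading of each x on the proxy
F = Xc @ phi / (phi @ phi)          # pass 2: cross-section regression each period -> factor
b = (F @ yc) / (F @ F)              # pass 3: predictive regression of y on the factor
print(np.corrcoef(F, f1)[0, 1])     # high: F tracks f1, not the dominant f2
```

Plain PCA on this X would extract f2 first, since it dominates the variance of X; the proxy in pass 1 is what keeps the filter pointed at f1.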

Yes, this is obvious to you. But perhaps it was not obvious to the person who wrote the first comment quoted in this post.