As one of the blind bitches, my best advice for alt text is to lead with the main context in a single-sentence summary and get more specific later if it's relevant. Alt text is read in the order it's written: if the summary is short and simple, I can tell right away whether it's something I care about listening to all the way through.
"A photo of an orange cat stretched out in the sun on a window ledge", for example, gives me the subject matter immediately - it's a photo of a cat - and the detail descends from there. Anything else in the image is coincidence or unnecessary; the photo was taken of the cat, and anything else in the frame is unimportant. The reason why the image exists should be in the first two lines - and comedic timing still works in alt text form! "A photo of an orange cat stretched out in the sun on a window ledge. A second cat is falling off a cat tree in the background." still gives that moment of realization that a build up to a joke usually would.
(Saying whether it's a real thing or an illustration or a movie scene or whatever is also pretty important for context - "an illustration of a dead dove" is pretty different from "a photograph of a dead dove".)
"A sunny room with a large window and a park outside with children playing in it. There is a wide, sunny windowsill with plants on it and a cat lying next to them, looking outside" describes the same hypothetical image, but the order of it changes the importance; while it would work to establish a scene in fiction (well, clumsily worded fiction, at least) it's missing the point as alt text - the cat's the reason the photo was taken, but everything else gets described first!
I'm no expert, nor do I intend to speak for Everyone With Vision Loss Ever, but as endemiccharm said, unless the details are relevant to why the image exists, they're probably not necessary to mention! Get Shorter.