Ai & design culture (part 2)
https://aecmag.com/ai/ai-design-culture-part-2/
Thu, 24 Jul 2025
How architects are using Ai models and how Midjourney V7 compares to Stable Diffusion and Flux

In the second of a two-part article on Ai image generation and the culture behind its use, Keir Regan-Alexander gives a sense of how architects are using Ai models and takes a deeper dive into Midjourney V7 and how it compares to Stable Diffusion and Flux

In the first part of this article I described the impact of new LLM-based image tools like GPT-Image-1 and Gemini 2.0 Flash (Experimental Image Mode).

Now, in this second part, I turn my focus to Midjourney, a tool that has recently undergone a few pivotal changes that I think are going to have a big impact on the fundamental design culture of practices. These changes are worthy of critical reflection as practices begin testing and adopting them:


1) Retexture – Reduces randomness and brings “control net” functionality to Midjourney (MJ). This means rather than starting with random form and composition, we give the model linework or 3D views to work from. Previously, despite the remarkable quality of image outputs, this was not possible in MJ.

2) Moodboards – Make it easy to very quickly “train your own style” with a small collection of image references. Previously we have had to train “LoRAs” in Stable Diffusion (SD) or Flux, taking many hours of preparation and testing. Moodboards provide a lower fidelity but much more convenient alternative.

3) Personal codes – Tailor your outputs to your taste profile using 'Personalize' (US spelling). You can train your own "--p" code by offering up hundreds of your own A/B test preferences within your account – you can then switch to your 'taste' profile extremely easily. In short, once you've told MJ what you like, it gets a whole lot better at giving it back to you each time.

A model that instantly knows your aesthetic preferences

Personal codes (or "Personalization" codes to be more precise) allow us to train MJ on our style preferences for different kinds of image material. To better understand the idea, in Figure 1 below you'll see a clear example of running the same text prompt both with and without my "--p" code. For me there is no contest: I consistently prefer the images generated with my --p code applied over those without.


Keir Regan-Alexander
(Left) an example of a generic MJ output, from a text prompt. The subject is a private house design in an Irish landscape. (Right) an output running the exact same prompt, but applying my personal "--p" code, which is trained on more than 450 individual A/B style image rankings

When enabled, Personalization substantially improves the average quality of your output: everything goes quickly from fairly generic 'meh' to 'hey!'. It's also now possible to develop a number of different personal preference codes for use in different settings. For example, one studio group or team may want to develop a slightly different style code of preferences to another part of the studio, because they work in a different sector with different methods of communication.




Midjourney vs Stable Diffusion / Flux

In the last 18 months, many heads have been turned by the potential of new tools like Stable Diffusion in architecture, because they have allowed us to train our own image styles, render sketches and gain increasingly configurable control over image generation using Ai – and often without even making a 3D model. Flux, a newer parallel open-source model ecosystem, has taken the same methods and techniques from SD and added greater levels of quality.

We may marvel at what Ai makes possible in shorter time frames, but we should all be thinking – “great, let’s try to make a bit more profit this year” not “great let’s use this to undercut my competitor

But for ease of use, broad accessibility and consistency of output, the closed-source (and paid product) Midjourney is now firmly winning for most practices I speak to that are not strongly technologically minded.

Anecdotally, when I do Ai workshops, perhaps 10% of attendees really 'get' SD, whereas more like 75% immediately tend to click with Midjourney. I find that it appeals to the intuitive and more nuanced instincts of designers who like to discover design through an iterative and open-ended method of exploration.

While SD & Flux are potentially very low cost to use (if you run them locally and have the necessary GPUs) and offer massive flexibility of control, they are also much, much harder to use effectively than MJ or, more recently, GPT-4o.
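To give a flavour of what 'running locally' actually involves, here is a minimal sketch using the open source Hugging Face diffusers library to generate an image with SDXL on a local GPU, with an optional custom style LoRA applied. The model IDs are real public checkpoints, but the prompt, LoRA path and settings are illustrative assumptions rather than a recommended workflow – most practices doing this seriously use node-based front ends such as ComfyUI instead.

```python
# Minimal local text-to-image with SDXL via the diffusers library.
# Assumes a CUDA GPU with enough VRAM; weights download from the
# Hugging Face Hub on first run.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# Optional: apply a custom style LoRA trained on your own reference images
# (the file path here is a placeholder).
# pipe.load_lora_weights("./loras/practice-style.safetensors")

prompt = (
    "concept render of a private house in an Irish coastal landscape, "
    "rough stone walls, overcast light, loose watercolour style"
)
image = pipe(prompt, num_inference_steps=30, guidance_scale=7.0).images[0]
image.save("concept_01.png")
```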

For a few months now, Midjourney has sat within a slick web interface that is very intuitive to use and will produce top quality output with minimal stress and technical research.

Before we reflect on what this means for the overall culture of design in architectural practice going forwards, here are two notable observations to start with:

1) Practices who are willing to try their hand with diffusion models during feasibility or competition stage are beginning to find an edge. More than one recent conversation suggests that the use of diffusion models during competition stages has made a pivotal difference to recent bid processes and partially contributed to winning proposals.

2) I now see a growing interest from my developer client base, who want to go ahead and see vivid imagery even before they’ve engaged an architect or design team – they simply have an idea and want to go directly to seeing it visualised. In some cases, developers are looking to use Ai imagery to help dispose of sites, to quickly test alternative (visual) options to understand potential, or to secure new development contracts or funding.

Make of that what you will. I’m sure many architects will be cringing as they read that, but I think both observations are key signals of things to come for the industry whether it’s a shift you support or not. At the same time, I would say there is certainly a commercial opportunity there for architects if they’re willing to meet their clients on this level, adjust their standard methods of engagement and begin to think about exactly what value they bring in curating initial design concepts in an overtly transparent way at the inception stage of a project.

Text vs Image – where are people focused?

While I believe focusing on LLM adoption currently offers the most immediate and broadest benefits across practice and projects, the image realm is where most architects are spending their time when they jump into Generative Ai.

If you’re already modelling every detail and texture of your design and you want finite control, then you don’t use an Ai for visualisation, just continue to use CGI

Architects are fundamentally aesthetic creatures and so perhaps unsurprisingly they assume the image and modelling side of our work will be the most transformed over time. Therefore, I tend to find that architects often really want to lean into image model techniques above alternative Ai methods or Generative Design methods that may be available.

In the short term, image models are likely to be the most impactful for “storytelling” and in the initial briefing stages of projects where you’re not really sure what you think about a distinctive design approach, but you have a framework of visual and 3D ideas you want to play with.

Mapping diffusion techniques to problems

If you're not sure what all of this means, see the table below for a simple explanation of these techniques mapped to typical problems faced by designers looking to use Ai image models.


Keir Regan-Alexander

Changes with Midjourney v7

Midjourney recently launched its v7 model and it was met with relatively muted praise, probably because people were so blown away by the groundbreaking potential of GPT-Image-1 (an autoregressive model) that arrived just a month before.

This latest version of the MJ model was trained entirely from scratch and, as a result, it behaves differently to the familiar v6.1 model. I'm finding myself switching between v7 and 6.1 more regularly than with any previous model release.

One of the striking things about v7 is that you can only access the model once you have provided at least 200 "image rating" preferences, which points to an interesting new direction for more customised Ai experiences. Perhaps Midjourney has realised that the personalisation now possible in the platform is exactly what people want in an age of abundant imagery (increasingly created with Ai).


Keir Regan-Alexander
Example of what the new MJ v7 model can do. (Left) an image set in Hamburg, created with a simple text to image prompt. (Right) a nighttime view of the same scene, created by 'retexturing' the left hand image within v7 and with 'personalize' enabled. The output is impressive because it's very consistent with the input image, and the transformations in the fore- and mid-ground parts of the image are very well executed.

I, for one, much prefer using a model that feels like it's tuned just for me – more broadly, I suspect users want to feel like only they can produce the images they create and that they have a more distinctive style as a result. Leaning more into "Personalize" mode is helping with that, and I like that MJ gates access to v7 behind the image ranking process.

I have achieved great results with the new model, but I find it harder to use and you do need to work differently with it. Here is some initial guidance on best use:

  • v7 has a new function called 'draft' mode which produces low-res options very fast. I'm finding that to get the best results in this version you have to work in this manner, first starting with draft mode enabled and then enhancing to larger resolution versions directly from there. It's almost like draft mode helps v7 work out the right composition from the prompt, and enhance mode then refines the resolution from there. If you try to go for full-res v7 in one rendering step, you'll probably be confused by the below-par output.
  • Getting your "personalize" code is essential for accessing v7, and I'm finding my --p code only begins to work relatively effectively from about 1,000+ rankings, so set aside a couple of hours to train your preferences in.
  • You can now prompt with voice activation mode, which means having a conversation about the composition and image type you are looking for. As you speak, v7 will start producing ideas in front of you.

Letting the model play

Image models improvise and this is their great benefit. They aren’t the same as CGI.

The biggest psychological hurdle that teams have to cross in the image realm is to understand that using Ai diffusion models is not like rendering in the way we’ve become accustomed to – it’s a different value proposition. If you’re already modelling every detail and texture of your design and you want finite control, then you don’t use an Ai for visualisation, just continue to use CGI.

However, if you can provide looser guidance with your own design linework before you’ve actually designed the fine detail, feeding inputs for the overall 3D form and imagery for textures and materials, then you are essentially allowing the model to play within those boundaries.

This means letting go of some control and seeing what the model comes back with – a step that can feel uncomfortable for many designers. When you let the model play within boundaries you set, you will likely find striking results that change the way you're thinking about the design you're working on. You may at times find yourself both repulsed and seduced in short order as you move from one image to the next, searching for a response that lands in the way you had hoped.

A big shift that I’m seeing is that Midjourney is making “control net” type work and “style transfer” with images accessible to a much wider audience than would naturally be inclined to try out a very technical tool like SD.
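For comparison, this is roughly what the equivalent 'control net' step looks like in the open source SD ecosystem: the generation is conditioned on an edge map extracted from a linework or hidden-line view. This is a hedged sketch only – the checkpoint names are public SDXL ControlNet models, but the file paths, prompt and conditioning scale are placeholder assumptions, and Midjourney's Retexture feature achieves a similar outcome without any of this setup.

```python
# Sketch of "control net" style conditioning with diffusers:
# generation is guided by edges extracted from a hidden-line / linework view.
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline

# Extract a Canny edge map from the design linework (file path is a placeholder)
line_view = cv2.imread("hidden_line_view.png")
gray = cv2.cvtColor(line_view, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 100, 200)
control_image = Image.fromarray(np.stack([edges] * 3, axis=-1))  # 1 channel -> RGB

controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-canny-sdxl-1.0", torch_dtype=torch.float16
)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    "photorealistic concept render, brick and timber facade, soft evening light",
    image=control_image,
    controlnet_conditioning_scale=0.7,  # how strictly to follow the linework
).images[0]
image.save("retextured_concept.png")
```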


Keir Regan-Alexander
Latest updates from Midjourney now allow control net drawing inputs (left), meaning for certain types of view we can go from hidden line design frameworks to rendered concept imagery, or with a further step of complexity, training our own ‘moodboard’ to apply consistent styling (right). Note, this technique works best for ‘close-up’ subjects

I think that Midjourney's decision to finally take the tool out of the very dodgy-feeling Discord and launch a proper, easy-to-use web UI has really made the difference for practices. I still love to work with SD most of all, but I really see these ideas beginning to land in MJ because it's just so much easier to get a good result first time and it's become really delightful to use.

Midjourney has a bit more work to do on its licence agreements (it is currently set up for single prosumers rather than enterprise) and privacy (they are training on your inputs). While you may immediately rule the tool out on this basis, consider: in most cases your inputs are primitive sketches or Enscape white card views – do you really mind if they are used for training, and do they give away anything that would be considered privileged? With Stealth mode enabled (which you have to be on the pro level for), your work can't be viewed in public galleries. To get going with Midjourney in practice, you will need to allay current business concerns, but with some basic guardrails in place for responsible use I am now seeing traction in practice.

Looking afresh at design culture

The use of "synthetic precedents" (i.e. images made purely with Ai) is also now beginning to shape our critical thinking about design in the early stages. Midjourney has an exceptional ability to tell vivid first-person stories around projects, design themes and briefs, with seductive landscapes, materials and atmosphere. From the evidence I've seen so far, the images very much appeal to clients.

We are now starting to see Ai imagery being pinned up on the wall for studio crits, and so I think we need to consider the impact of Ai on the overall design culture of the profession.


Keir Regan-Alexander
Example of sketch-to-render using Midjourney, but including style transfer. In this case a "synthetic precedent" is used to seed the colour and material styles of the final render using the --sref tool.

If we put Ai aside for a moment – in architectural practice, I think it's a good idea to regularly reflect on your current studio design culture by first considering:

  • Are we actually setting enough time aside to talk about design or is it all happening ad-hoc at peoples’ desks or online?
  • Do we share a common design method and language that we all understand implicitly?
  • Are we progressing and getting better with each project?
  • Are all team members contributing to the dialogue or waiting passively to be told what to do by a director with a napkin sketch?
  • Are we reverting to our comfort zone and just repeating tired ideas?
  • Are we using the right tools and mediums to explore each concept?

When people express frustration with design culture, they often refer specifically to some aspect of technological "misuse", for example:

  1. “People are using SketchUp too much. They’re not drawing plans anymore”
  2. “We are modelling everything in Revit at Stage 3, and no one is thinking about interface detailing”
  3. “All I’m seeing is Enscape design options wall to wall. I’m struggling to engage”
  4. “I think we might be relying too heavily on Pinterest boards to think about materials”, or maybe;
  5. “I can’t read these computer images. I need a model to make a decision”.

… all things I’ve heard said in practice.

Design culture has changed a lot since I entered the profession, and I have found that our relationship with the broad category of "images" in general has changed dramatically over time. Perhaps this is because we used to have to do all our design research by collecting monographs and visiting actual buildings to see them, whereas now I probably keep up to date on design in places like Dezeen or Arch Daily – platforms that specifically glorify the single image icon and that jump frenetically across scale, style and geography.

One of the great benefits of my role with Arka Works is that I get to visit so many design studios (more than 70 since I began) and I’m seeing so many different ways of working and a full range of opinions about Ai.

I recently heard from a practice leader who said that in their practice, pinning up the work of a deceased (and great) architect was okay, because if it’s still around it must have stood the test of time and also presumably it’s beyond the “life plus 70 year Intellectual Property rule” – but in this practice the random pinning up of images was not endorsed.

Other practice leads have expressed to me that they consider all design work to be somehow derivative and inspired by things we observe – in other words, it couldn't exist without designers ruminating on shared ideas, being enamoured of another architect's work, or just plain using other people's design material as a crib sheet. In these practices, you can pin up whatever you like – if it helps to move the conversation forward.

Some practices have specific rules about design culture – they may require a pin up on a schedule with a specific scope of materials – you might not be allowed to show certain kinds of project imagery, without a corresponding plan, for example (and therefore holistic understanding of the design concepts). Maybe you insist on models or prefer no renders.

I think those are very niche cases. More often I see images and references simply being used as a shortcut for words and I also think we are a more image-obsessed profession than ever. In my own experience so far, I think these new Ai image tools are extremely powerful and need to be wielded with care, but they absolutely can be part of the design culture and have a place in the design review, if adopted with good judgement.

This is an important caveat. The need for critical judgment at every step is absolutely essential and made all the more challenging by how extraordinary the outputs can be – we will be easily seduced into thinking “yes that’s what I meant”, or “that’s not exactly what I meant, but it’ll do”, or worse “that’s not at all what I meant, but the Ai has probably done a better job anyway – may as well just use Ai every time from now on.”

Pinterestification

The shortening of attention spans is a problem we face in all realms of popular culture, as we become more digital every day. We worry that quality will suffer as shrinking attention spans breed laziness around creating and testing design ideas – a broad dumbing-down effect. This has been referred to as the 'idiot trap', where we rely so heavily on subcontracting thinking to various Ais that we forget how to think from first principles.

You might think as a reaction – "well, let's just not bother with Ai at all" – and I think that's a valid critique if you believe that architectural creativity is a wholly artisanal and necessarily human-crafted process.

The practices that feel that way probably just aren't calling me to talk about Ai, but you would be surprised by the kind of 'artisanal' practices who are extremely interested in adopting Ai image techniques because, rather than seeing them as a threat, they see them as another way of exercising and exploring their creative vision.

Perhaps you have observed something I call “Pinterestification” happening in your studio?

I describe this as the algorithmic convergence of taste around common tropes and norms. If you pick a chair you like in Pinterest, it will immediately start nudging you in the direction of living room furniture, kitchen cabinets and bathroom tiles that you also just happen to love.

They all go so well on the mood board…

It's almost like the algorithm has aggregated the collective design preferences of millions of tastemakers and packaged them up onto a website with convenient links to buy all the products we hanker after – and that's because it has.


Keir Regan-Alexander
(Left) a screenshot from the "ArkaPainter_MJ" moodboard, which is a selection of 23 synthetic training images, the exact same selection that was recently used to train an SD LoRA in a similar style. (Right) the output from MJ applies the paint and colour styles of the moodboard images to a new setting – in this case the same kitchen drawing as presented previously

Pinterest is widely used by designers and now heavily relied upon. The company has mapped our clicks; they know what goes together, what we like, what other people with similar taste like – and the incentives of ever greater attention mean that it’s never in Pinterest’s best interest to challenge you. Instead, Pinterest is the infinite design ice cream parlour that always serves your favourite flavour; it’s hard to stop yourself going back every time.

Learning about design

I’ve recently heard that some universities require full disclosure of any Ai use and that in other cases it can actually lead to disciplinary action against the student. The academic world is grappling with these new tools just as practice is, but with additional concerns about how students develop fundamental design thinking skills – so what is their worry?

The tech writer Paul Graham once said “writing IS thinking” and I tend to agree. Sure, you could have an LLM come up with a stock essay response – but the act of actually thinking by writing down your words and editing yourself to find out where you land IS the whole point of it. Writing is needed to create new ideas in the world and to solve difficult problems. The concern from universities therefore is that if we stop writing, we will stop thinking.

For architects, sketching IS our means of design thinking – it's consistently the most effective method of 'problem abstraction' that we have. If I think back to the most skilful design mentors I had in my early career, they were ALL expert draftspeople.

That's because they came up on the drawing board, and what that meant was they could distil many problems quickly and draw a single thread through things to find a solution, in the form of an erudite sketch. They drew sparingly, putting just the right amount of information in all the right places and knowing when to explore different levels of detail – because when you're drawing by hand, you have to be efficient – you have to solve problems as you go.

Someone recently said to me that the less time the profession has spent drawing by hand (having moved to CAD, Revit or Ai), the less architects have earned overall. This is indeed a bit of a mind puzzle, and the crude problem is that when a more efficient technology exists, we are forced into adoption because we have to compete for work, whether it's in our long-term interests or not – it's a Catch-22.

But this observation contains a signal too; that immaculate CAD lines do a different job from a sketch or hand drawing. The sketch is the truly high-value solution; the CAD drawing is the prosaic instructions for how to realise it.

I worry that “the idiot trap” for architects would be losing the fundamental skills of abstract reasoning that combines spatial, material, engineering and cultural realms and in doing so failing to recognise this core value as being the thing that the client is actually paying for (i.e. they are paying for the solution, not the instructions).

Clients hire us because we can see complete design solutions and find value where others can’t and because we can navigate the socio-political realm of planning and construction in real life – places where human diplomacy and empathy are paramount.

They don’t hire us to simply ‘spend our time producing package information’ – that is a by-product and in recent years we’ve failed to make this argument effectively. We shouldn’t be charging “by the time needed to do the drawing”, we should be charging “by the value” of the building.

So as we consider things being done more quickly with Ai image models, we need to build consensus that we won’t dispense with the sketching and craft of our work. We have to avoid the risk of simply doing something faster and giving the saving straight back to the market in the form of reduced prices and undercutting. We may marvel at what Ai makes possible in shorter time frames, but we should all be thinking – “great, let’s try to make a bit more profit this year” not “great let’s use this to undercut my competitor”.

Conclusion: judicious use

There is a popular quote (by Joanna Maciejewska) that has become a meme online:

I want Ai to do my laundry and dishes, so that I can do art and writing, not for Ai to do my art and writing so that I can do my laundry and dishes

If we translate that into our professional lives, for architects that would probably mean having Ai assisting us with things like regulatory compliance and auditing, not making design images for us.

Counter-intuitively Ai is realising value for practices in the very areas we would previously have considered the most difficult to automate: design optioneering, testing and conceptual image generation.

When architects reach for a tool like Midjourney, we need to be aware that these methods go right to the core of our value and purpose as designers. More than that, Ai imagery forces us to question our existing culture of design and methods of critique.

Unless we expressly dissuade our teams from using tools like Midjourney (which would be a valid position), anyone experimenting with it will now find it to be so effective that it will inevitably percolate into our design processes in ways that we don’t control, or enjoy.

Rather than allow these ad-hoc methods to creep up on us in design reviews unannounced and uncontrolled, a better approach is to consider first what would be an ‘aligned’ mode of adoption within our design processes – one that fits with the core culture and mission of the practice and then to make more deliberate use of it with endorsed design processes that create repeatable outputs that we really appreciate.


Keir Regan-Alexander
Photo taken during a design review at Morris+Company in 2022 – everyone standing up, drawings pinned up, table of material samples, working models, coffee cups. How will Ai imagery fit into this kind of crit setting? Should it be there at all? (photo: Architects from left to right: Kehinde, Funmbi, Ben, Miranda & David)

If you have a particularly craft-based design method, you could consider how that mode of thinking could be applied to your use of Ai. Can you take a particularly experimental view of adoption that aligns with your specific priorities? Think Archigram with the photocopier.

We also need to ask, when something is pinned up on a wall alongside other material, whether it can be judged objectively on its merits and relevance to the project – and if it stands up to this test, does it really matter to us how it was made? If I tell you it's "Ai generated", does it reduce its perceived value?

I find that experimentation with image models is best led by the design leaders in practice because they are the “tastemakers” of practice and usually create the permission structures around design. Image models are often mistakenly categorised as technical phenomena and while they require some knowledge and skill, they are actually far more integral to the aesthetic, conceptual and creative aspects of our work.

To get a picture of what "aligned adoption of Ai" would mean for your practice, it should feel like you're turning up the volume on the particular areas of practice that you already excel at, or conversely using it to mitigate the aspects of practice where you feel acutely weaker.

Put another way – Ai should be used to either reinforce whatever your specialist niche is or to help you remedy your perceived vulnerabilities. I particularly like the idea of leaning into our specialisms because it will make our deployment of Ai much more experimental, more bespoke and more differentiated in practice.

When I am applying Ai in practice, I don't see depressed and disempowered architects – I am reassured to find that the people who are most effective at writing bids with Ai also tend to be some of the best bid writers. The people who end up becoming the most experimental and effective at producing good design images with Ai image models also tend to be great designers, and this trend holds in all areas where I see Ai being used judiciously, so far – without exception.

The “judicious use” part is most important because only a practitioner who really knows their craft can apply these ideas in ways that actually explore new avenues for design and realise true value in project settings. If you feel that description matches you – then you should be getting involved and having an opinion about it. In the Ai world this is referred to as keeping the “human-in-the-loop” but we could think of it as the “architect-in-the-loop” continuing to curate decisions, steer things away from creative cul de sacs and to more effectively drive design.


Recommended viewing

Keir Regan-Alexander is director at Arka Works, a creative consultancy specialising in the Built Environment and the application of AI in architecture. At NXT BLD 2025 he explored how to deploy Ai in practice.

CLICK HERE to watch the whole presentation free on-demand


AI and design culture (part 1)
https://aecmag.com/ai/ai-design-culture-part-1/
Wed, 28 May 2025
Keir Regan-Alexander explores the opportunities and tensions between creativity and computation

As AI tools rapidly evolve, how are they shaping the culture of architectural design? Keir Regan-Alexander, director of Arka.Works, explores the opportunities and tensions at the intersection of creativity and computation — challenging architects to rethink what it means to truly design in the age of AI

An awful lot has been happening recently in the AI image space, and I've written and rewritten this article about three times to try and account for everything. Every time I think it's done, there seems to be another release that moves the needle. That's why this article is in two parts: first I want to look at recent changes from Gemini and GPT-4o, and then take a deeper dive into Midjourney V7 and give a sense of how architects are using these models.

I’ll start by describing all the developments and conclude by speculating on what I think it means for the culture of design.


Arka Works
(Left) an image used as input (created in Midjourney). (Right) an image returned from Gemini that precisely followed my text-based request for editing

Right off the bat, let’s look at exactly what we’re talking about here. In the figure above you’ll see a conceptual image for a modern kitchen, all in black. This was created with a text prompt in Midjourney. After that I put the image into Gemini 2.0 (inside Google AI Studio) and asked it:

“Without changing the time of day or aspect ratio, with elegant lighting design, subtly turn the lights (to a low level) on in this image – the pendant lights and strip lights over the counter”

Why is this extraordinary?

Well, there is no 3D model for a start. But look closer at the light sources and shadows. The model knew exactly where to place the lights. It knows the difference between a pendant light and a strip light and how they diffuse light. It then knows where to cast the multi-directional shadows, and that the material textures of each surface would have diffuse, reflective or caustic illumination qualities. Here's another one (see below). This time I'm using GPT-4o in Image Mode.


Arka Works
(Left) a photograph taken in London on my ride home (building on Blackfriars Road). (Right) GPT-4o’s response to my request, a charming mock up of a model sample board of the facade

"Create an image of an architectural sample board based on the building facade design in this image"

Why is this one extraordinary?

Again, no 3D model and, with only a couple of minor exceptions, the architectural language of specific ornamentation, materials, colours and proportion has all been very well understood. The image is also (in my opinion) very charming. During the early stages of design projects, I have always enjoyed looking at the local "Architectural Taxonomy" of buildings in context and this is a great way of representing it.

If someone in my team had made these images in practice I would have been delighted and happy for them to be included in my presentations and reports without further amendment.

A radical redistribution of skills

There is a lot of hype in AI which can be tiresome, and I always want to be relatively sober in my outlook and to avoid hyperbole. You will probably have seen your social media feeds fill with depictions of influencers as superhero toys in plastic wrappers, or maybe you’ve observed a sudden improvement in someone’s graphic design skills and surprisingly judicious use of fonts and infographics … that’s all GPT-4o Image Mode at work.




So, despite the frenzy of noise, the surges of insensitivity towards creatives and the abundance of Studio Ghibli IP infringement surrounding this release – in case it needs saying just one more time – in the most conservative of terms, this is indeed a big deal.

The first time you get a response from these new models that far exceeds your expectations, it will shock you and you will be filled with a genuine sense of wonder. I imagine the reaction feels similar to that of the first people to see a photograph in the early 19th century – it must have seemed genuinely miraculous and inexplicable. You feel the awe and wonder, then you walk away and you start to think about what it means for creators, for design methods … for your craft … and you get a sinking feeling in your stomach. For a couple of weeks after trying these new models for the first time I had a lingering feeling of sadness with a bit of fear mixed in.

These techniques are so accessible in nature that we should expect to see our clients briefing us with ever-more visual material. We therefore need to not be afraid or shocked when they do

I think this feeling was my brain finally registering the hammer dropping on a long-held hunch; that we are in an entirely new industry whether we like it or not and even if we wanted to return to the world of creative work before AI, it is impossible. Yes, we can opt to continue to do things however we choose, but this new method now exists in the world and it can’t be put back in the box.

I'll return to this internal conflict again in my conclusion. If we set aside the emotional reaction for a moment, the early testing I've been doing in applying these models to architectural tasks suggests that, in both cases, the latest OpenAI and Google releases could prove to be "epoch defining" moments for architects and for all kinds of creatives who work in the image and video domains.

This is because the method of production and the user experience is so profoundly simple and easy compared to existing practices, that the barrier for access to image production in many, many realms has now come right down.

Again, we may not like to think about this from the perspective of having spent years honing our craft, yet the new reality is right in front of us and it's not going anywhere. These new capabilities can only lead to a permanent change in the working relationship between the commissioning client and the creative designer, because the means of graphic and image production have been completely reconfigured. In a radical act of forced redistribution, access to sophisticated skill sets is now being packaged up by the AI companies for anyone who pays the licence fee.

What has not (yet) been distributed is wise judgement, deep experience in delivery, good taste, entirely new aesthetic ideas, emotional human insight, vivid communication and political diplomacy; all attributes that come with being a true expert and practitioner in any creative and professional realm.

These are qualities that for now remain inalienable and should give a hint at where we have to focus our energies in order to ensure we can continue to deliver our highest value for our patrons, whomever they may be. For better or worse, soon they will have the option to try and do things without us.

Chat-based image creation & editing

For a while, attempting to produce or edit images within chat apps produced only sub-standard results. The likes of "Dall-E", which could be accessed only within otherwise text-based applications, had really fallen behind and were producing 'instantly AI-identifiable' images that felt generic and cheesy. Anything that is so obviously AI created (and low quality) means that we instantly attribute a low value to it.

As a result, I was seeing designers flock instead to more sophisticated options like Midjourney v6.1 and Stable Diffusion SDXL or Flux, where we can be very particular about the level of control and styling and where the results are often either indistinguishable from reality or indistinguishable from human creations. In the last couple of months that dynamic has been turned upside down; people can now achieve excellent imagery and edits directly with the chat-based apps again.

The methods that have come before, such as MJ, SD and Flux, are still remarkable and highly applicable to practice – but they all require a fair amount of technical nous to get consistent and repeatable results. I have found through my advisory work with practices that having a technical solution isn't what matters most; it's having it packaged up and made enjoyable enough to use that it can change rigid habits.

A lesser tool with a great UX will beat a more sophisticated tool with a bad UX every time.

These more specialised AI image methods aren’t going away, and they still represent the most ‘configurable’ option, but text-based image editing is a format that anyone with a keyboard can do, and it is absurdly simple to perform.

More often than not, I’m finding the results are excellent and suitable for immediate use in project settings. If we take this idea further, we should also assume that our clients will soon be putting our images into these models themselves and asking for their ideas to be expressed on top…


Arka Works
(Left) Image produced in Midjourney. (Right) Gemini has changed the cladding to dark red standing seam zinc and also changed the season to spring. The mountains are no longer visible but the edit is extremely high quality.

We might soon hear our clients saying; “Try this with another storey”, “Try this but in a more traditional style”, “Try this but with rainscreen fibre cement cladding”, “Try this but with a cafe on the ground floor and move the entrance to the right”, “Try this but move the windows and make that one smaller”…

You get the picture.

Again, whether we like this idea or not (and I know architects will shudder even thinking of this), when our clients receive the results back from the model, they are likely to be similarly impressed with themselves, and this can only lead to a change in briefing methods and working dynamics on projects.

To give a sense of what I mean exactly, in the image below I’ve included an example of a new process we’re starting to see emerge whereby a 2D plan can be roughly translated into a 3D image using 4o in Image Mode. This process is definitely not easy to get right consistently (the model often makes errors) and also involves several prompting steps and a fair amount of nuance in technique. So far, I have also needed to follow up with manual edits.
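For anyone who wants to experiment with this kind of plan-to-image request outside the chat window, the same idea can be expressed programmatically. The sketch below uses OpenAI's Images API with the gpt-image-1 model; it is a single-step illustration under assumed file names and prompt wording, and it deliberately does not capture the multi-step prompting and manual clean-up described above.

```python
# Single-step sketch: send a 2D plan plus a text instruction to gpt-image-1
# via the OpenAI Images API and save the returned render.
# (Illustrative only - the process described in the article involves
# several intermediate prompts and manual correction.)
import base64
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

result = client.images.edit(
    model="gpt-image-1",
    image=open("apartment_plan.png", "rb"),  # placeholder file name
    prompt=(
        "Using this 2D apartment plan, produce an eye-level interior render "
        "taken from the position marked with the red arrow, looking towards "
        "the kitchen. Keep the layout, openings and proportions consistent "
        "with the plan."
    ),
)

image_bytes = base64.b64decode(result.data[0].b64_json)
with open("apartment_render.png", "wb") as f:
    f.write(image_bytes)
```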


Arka Works
(Left) Image produced in Midjourney using a technique called ‘moodboards’. (Right) Image produced in GPT-4o Image Mode with a simple text prompt

Despite those caveats, we can assume that in the coming months the models will solve these friction points too. I saw this idea first validated by Amir Hossein Noori (co-founder of the AI Hub) and while I’ve managed to roughly reproduce his process, he gets full credit for working it out and explaining the steps to me – suffice to say it’s not as simple as it first appears!

Conclusion: the big leveller

1) Client briefing will change

My first key conclusion from the last month is that these techniques are so accessible in nature that we should expect to see our clients briefing us with ever-more visual material. We therefore need to not be afraid or shocked when they do.

I don’t expect this shift to happen overnight, and I also don’t think all clients will necessarily want to work in this way, but over time it’s reasonable to expect this to become much more prevalent and this would be particularly the case for clients who are already inclined to make sweeping aesthetic changes when briefing on projects.

Takeaway: As clients decide they can exercise greater design control through image editing, we need to be clearer than ever on how our specialisms are differentiated and better able to explain how our value proposition sets us apart. We should be asking: what are the really hard and domain-specific niches that we can lean into?

2) Complex techniques will be accessible to all

Next, we need to stop thinking of technical hurdles as a 'defensive moat' for our work. The most noticeable trend in the last couple of years is that the things that appear profoundly complicated at first often go on to become much simpler to execute later.

As an example, a few months ago we had to use ComfyUI (a complex node-based interface for using Stable Diffusion) for 're-lighting' imagery. This method remains optimal for control, but now, for many situations, we could just make a text request and let the model work out how to solve it directly. Let's extrapolate that trend and assume, as a generalisation, that the harder things we do will gradually become easier for others to replicate.

Muscle memory is also a real thing in the workplace; it's often so much easier to revert to the way we've done things in the past. People will say "Sure it might be better or faster with AI, but it also might not – so I'll just stick with my current method". This is exactly the challenge that I see everywhere, and the people who make progress are the ones who insist on proactively adapting their methods and systems.

The major challenge I observe for organisations through my advisory work is that behavioural adjustments to working methods when you're under stress or a deadline are the real bottleneck. The idea here is that while a 'technical solution' may exist, change will only occur when people are willing to do something in a new way. I do a lot of work now on "applied AI implementation" and engagement across practice types and scales. I see again and again that there are pockets of technical innovation and skill among certain team members, but I also see that it's not being translated into actual changes in the way people do things across the broader organisation. This is a lot to do with access to suitable training, but also with a lack of awareness that improving working methods is much more about behavioural incentives than it is about 'technical solutions'.

In a radical act of forced redistribution, access to sophisticated skill sets is now being packaged up by the AI companies for anyone who pays the licence fee

There is an abundance of new groundbreaking technology now available to practices, maybe even too much – we could be busy for a decade with the inventions of the last couple of years alone. But in the next period, the real difference maker will not be technical, it will be behavioural. How willing are you to adapt the way you’re working and try new things? How curious is your team? Are they being given permission to experiment? This could prove a liability for larger practices and make smaller, more nimble practices more competitive.

Takeaway: Behavioural change is the biggest hurdle. As the technical skills needed for the means of creative production become more accessible to all, the challenge for practices in the coming years may not be all about technical solutions; it will be more about their willingness and ability to adjust behaviour and culture. The teams who succeed won't be the people who have the most technically accomplished solutions; more likely it will be those who achieve the most widespread and practical adaptations of their working systems.

3) Shifting culture of creativity

I’ve seen a whole spectrum of reactions towards Google and OpenAI’s latest releases and I think it’s likely that these new techniques are causing many designers a huge amount of stress as they consider the likely impacts on their work. I have felt the same apprehension many times too. I know that a number of ‘crisis meetings’ have taken place in creative agencies for example, and it is hard for me to see these model releases as anything other than a direct threat to at least a portion of their scope of creative work.

This is happening to all industries, not least across computer science, after all – LLMs can write exceptional code too. From my perspective, it’s certainly coming for architecture as well, and if we are to maintain the architect’s central role in design and place making, we need to shift our thinking and current approach or our moat will gradually be eroded too.

The relentless progression of AI technology cares little about our personal career goals and business plans and when we consider the sense of inevitability of it all – I’m left with a strong feeling that the best strategy is actually to run towards the opportunities that change brings, even if that means feeling uncomfortable at first.

Among the many posts I’ve seen celebrating recent developments from thought leaders and influencers seeking attention and engagement, I can see a cynical thread emerging … of (mostly) tech and sales people patting themselves on the back for having “solved art”.


Arka Works
(Left) An example plan of an apartment (AI Hub), with a red arrow denoting the camera position. (Right) a render produced with GPT-4o Image Mode (produced by Arka Works)

The posts I really can’t stand are the cavalier ones that actually seem to rejoice at the idea of not needing creative work anymore and salivating at the budget savings they will make … they seem to think you can just order “creative output” off a menu and that these new image models are a cure for some kind of long held frustration towards creative people.

Takeaway: The model “output” is indeed extraordinarily accomplished and produced quickly, but creative work is not something that is “solvable”; it either moves you or it doesn’t and design is similar — we try to explain objectively what great design quality is, but it’s hard. Certainly it fits the brief – yes, but the intangible and emotional reasons are more powerful and harder to explain. We know it when we see it.

While AIs can exhibit synthetic versions of our feelings, for now they represent an abstracted shadow of humanness – it is a useful imitation for sure and I see widespread applications in practice, but in the creative realm I think it’s unlikely to nourish us in the long term. The next wave of models may begin to ‘break rules’ and explore entirely new problem spaces and when they do I will have to reconsider this perspective.

We mistake the mastery of a particular technique for creativity and originality, but the thing about art is that it comes from humans who’ve experienced the world, felt the emotional impulse to share an authentic insight and cared enough to express themselves using various mediums. Creativity means making something that didn’t exist before.

That essential impulse, the genesis, the inalienably human insight and direction is still for me, everything. As we see AI creep into more and more creative realms (like architecture) we need to be much more strategic about how we value the specifically human parts and for me that means ceasing to sell our time and instead learning to sell our value.

In part 2 I will be looking in depth at Midjourney and how it's being used in practice. I'll also be looking specifically at the latest release (V7) in more detail. Until then — thanks for reading.


Catch Keir Regan-Alexander at NXT BLD

Keir Regan-Alexander is director at Arka Works, a creative consultancy specialising in the Built Environment and the application of AI in architecture.

He will be speaking on AI at AEC Magazine’s NXT BLD in London on 11 June.

AI is hard (to do well)
https://aecmag.com/ai/ai-is-hard-to-do-well/
Fri, 13 Sep 2024
Generative AI (GenAI) is extremely promising, but achieving tangible results is more complex than the hype suggests.

Generative AI (GenAI) is extremely promising, but achieving tangible results is more complex than the hype suggests. Keir Regan-Alexander, architect and founder of creative consultancy, Arka Works, highlights the challenges of implementing GenAI and offers practical strategies for professional practice

Have you noticed how AI is often written about in two dramatically different ways? a) it's a silver bullet, or b) it's a scam. On the one hand, it's presented as a Frankensteinian invention that is changing the employment landscape for many knowledge workers.

On the other, people are becoming exasperated by hyped-up claims and over-promises being pushed by “corporate grifters”.

But what if neither extreme is wholly true and GenAI proves to be … c) worth the effort and much liked by the employees who use it in the real-world?

This technology is beguiling because it presents itself as friendly and simple. Ask a question and, like popping corn, it comes noisily to life. This sweet, surface level experience is why there is such a widespread conceit on LinkedIn that using Generative AI to solve all your business problems is easy.

Here’s the formula:

You post an eye-catching image and add the hook "I just solved [insert challenging task] in 5 minutes. RIP [insert profession] 🚀"

People see these posts and they conclude that a whole host of other ideas must therefore also be possible. This is the peak of inflated expectations.

They try to emulate the idea at work, only to quickly discover that the claim was only “sort of” true. When real-world constraints and quality standards are applied to the recipe, the method falls short in some critical way.

  • Cue feeling disheartened.
  • Cue labelling Generative AI as all hype and writing it off for anything useful.
  • Cue pulling up the drawbridge of curiosity and deciding we don’t need to be laser-focused on AI after all.

The trough of disillusionment is reached, and a microcosmic version of the hype cycle is complete.

As we approach two years since GPT-3 went mainstream, we remain at the beginning of the very first innings for GenAI and it’s not productive to try and rush to definitive conclusions about what it all means just yet.

It’s also not productive to spread claims that AI is easy to do well; it’s not.

Getting high-quality repeatable results across your org using AI is hard.

Adding yet more software processes to your stack is hard.

Remembering you can do something differently when you’ve been doing it the same way for years, is hard.

The human in the loop

The prospect of a mature AI adoption landscape across industry-wide settings – where complete workflows are delivered end-to-end – appears at present to be a long way off. Not least because very few existing organisations have their project and operational data structured and prepared for such change.

But while regular businesses are not really ready, this is what large AI companies are planning for:

“OpenAI recently unveiled a five-level classification system to track progress toward artificial general intelligence (AGI) and have suggested that their next-generation models (GPT-5 and beyond) will be Level 2, which they call “Reasoners.” This level represents AI systems capable of problem-solving tasks on par with a human of doctorate-level education. The subsequent levels include “Agents” (Level 3), which can perform multi-day tasks on behalf of users, “Innovators” (Level 4) that can generate new innovations, and finally “Organisations” (Level 5), i.e. AI systems capable of performing the work of entire businesses. As we progress through these levels, the potential applications and impacts of AI will expand dramatically.” Stephen Hunter, Buckminster AI.

The “Level 4 Innovators” moment appears to be the point at which things will start to feel very different and this projection suggests we are 4-5 major development cycles from it.

While I believe such end-to-end workflows are more probable than not in the coming years, I’m doubtful that widespread automated workflows and decision-making without human oversight at critical steps would be a desirable outcome for anyone, even if it were possible. Dramatic shifts in technology over very short time periods can be ruthlessly inhumane when driven by purely utilitarian priorities – just look back at the Industrial Revolution and what became of the Luddite movement for reference.

Indeed, the "human in the loop" has been essential to every successful implementation of GenAI that I've seen in professional practice. No high-value professional task can happen without sound judgement, discretion and (seldom mentioned) good taste. People also take responsibility for outcomes – as Ted Chiang points out, the human creative mind expresses "creative intention" at every moment.

Until we see evidence that traditional “Knowledge Work” businesses can thrive and compete without this essential ingredient, then GenAI’s job will be to provide scaffolding that helps to prop up what we already do, rather than re-build it entirely.

One process at a time

In the last two years, the hype around AI has risen to unreasonable levels and every software product has been slathered in AI that no one asked for and that rarely works as desired.

But rather than trying to reinvent the wheel with wholesale departmental and budget changes caused by implementing a million new AI apps or hiring a team of software developers, my recent work with professional practice suggests that you’ll likely find it more impactful to refine just one humble spoke of the wheel at a time and to build momentum with incremental but lasting adaptations that support what you already do well.

When you focus on smaller, yet pivotal seeds of contribution from AI and put it in the many hands of your team, rather than mandating prescribed use from “on high”, you can strike a powerful balance: boosting your existing way of working while maintaining individual control, sound judgement and freedom over each productive step. Pretty soon, you will see new and varied species of work evolving across the office.

These adjustments to 'chunks' of work need to be made with vocal and transparent engagement about cultural alignment with this new technology.

The formation of an ethical framework for responsible use within the business – one that sets guardrails for adoption – is also needed, alongside new quality measures to keep standards from slipping, which is always a risk when you make something faster and easier.

You also need a programme for new skills training so that you can equip your team with the resources to go forth and explore for themselves. If Knowledge Work is going to change in the coming years, providing new skills and tools as broadly as you can seems the fair thing to do.



Proving product-market-fit

If you’re doing it right, the productive benefits of AI will be felt by the business but largely flow via the employees who directly use it – employees may find they achieve more in the same time, to a higher level of quality and even with a greater sense of enjoyment in their work.

That has been my experience, and I know many others who increasingly feel the same. Anecdotally, I know many employees now hold personal licences for various consumer-facing GenAI tools that they choose to pay for out of their own pocket each month, because these tools bring such high levels of utility and directly improve their working lives.

That’s one of the purest examples of “product-market fit” that you will find, and it wouldn’t be happening unless people had worked out how to really drive value from these new techniques. Imagine paying for software you use for work out of your own pocket, just to improve your working life.

In most of these cases, they don't tell their employers – and I know this because people discuss it openly.

It's possible that many leaders are simply turning a blind eye because it works for them too. Or this is the outcome of a kneejerk AI policy, written over a year ago, calling for the total prohibition of AI on company equipment. In these cases, policies are commonly being ignored, and this is the worst outcome of all: uncontrolled use, no privacy controls, and the likely leakage of GDPR-protected, NDA-covered and commercially sensitive information out of the business.

Where to focus

Many hundreds of new software products have been marketed to professionals in the last couple of years – lists upon lists of “must have” new names and logos.

I have tested many and I like a number of them. But for every new and well-built tool there are many more in the ‘vapourware’ category that fail to effectively solve a real problem for practices or that duplicate something another tool has already done better.

Moreover, the sheer scale of new gadgets and techniques can cause paralysis in businesses that get caught in perpetual "test-mode" and become unable to take any meaningful action at all. The ever-growing size of these lists doesn't help either – it's a never-ending task.

To keep things simple and to focus on what is really important, there are, in my mind, two foundational technologies that deserve special and ongoing attention with each new release:

  1. Image models (diffusion models like SDXL and Flux), which can now also process video.
  2. Text models (Large Language Models like GPT-4o and Claude 3.5), which can now also process numerical data.

These two areas alone are enormously deep and novel fields of learning. I recommend getting familiar with how to access and use these tools as directly as you can, in their "raw" form – that is, don't become too reliant on easy-to-use wrappers that do many things for you but ultimately reduce your control. You will find that many of the app features you've seen are achievable by working with the raw ingredients, with only a couple of low-cost subscriptions needed.


Custom Styles: Arka-ArchPainter-XL, a Stable Diffusion Custom LoRA (credit: Arka Works)

Image work in practice

My background is in design, and I love to use image models for visual concepts.

I don’t find these tools can solve multistep problems in one shot – instead, I curate their use over smaller discrete chunks of controlled taskwork and weave the whole thing together using a number of well-known tools that I would already be using in traditional practice.

To give you a better sense of what I mean, in the table below you can see some practical examples from the last month at Arka Works.


Arka Works

While doing these chunks of work with teams, we're exercising our own professional and creative judgement at each step.

What you'll notice when looking at these approaches is that they are quite complex procedures requiring a high degree of control and aesthetic judgement, and that the whole process is carefully curated by the designer. This is why some people prove particularly gifted at working with diffusion models by following their intuition, while others are less so – if the results were just a case of clicking buttons in the right order, this wouldn't be true.


Precision editing: Example workflow for a warehouse building refurbishment, created in Stable Diffusion (credit: Arka Works)

AI feels different

The introduction of GenAI to these very common practical design tasks is more akin to plugging an effects pedal or synthesiser into a musical instrument. In general, I prefer the "instrument" analogy to "tool", which just conjures images of crude hammers and nails. By contrast, this instrument is nuanced, unpredictable and can make its own decisions.

My initial expectation of the various AI image tools was that they would prove to be a like-for-like replacement for traditional digital rendering methods.

The reality is quite different: this is an entirely new angle from which to approach the same challenge, and it requires a fundamental shift in the way you think about things.

Indeed, this has been a source of debate between me and Ismail Sileit, who says of image diffusion:

"While traditional rendering techniques are about faithfully replicating reality through precise algorithms, GenAI allows us to engage in a live dialogue with possibility. It's not so much about rendering accuracy – it's more about cultivating a relationship with the unpredictable, the emergent, and the profoundly novel." Ismail Sileit, architect at Foster + Partners and creator of form-finder.com (in a personal capacity)

What Ismail is getting at here is that yes – there are certainly time savings, but we shouldn't overstate these. The main change he perceives is in the speed and breadth of the creative feedback loop itself, the new experimental avenues that we're able to explore, and importantly the enhanced enjoyment of the whole process.

Images: what’s hard?

Despite what the "AI is easy" influencers tell you, it's not a simple process if you want results you can use on real projects. To exercise real control in practice you have to learn quite complex interfaces and become familiar with a new lexicon of technical terms like "denoising", "latent space", "seeds", "checkpoints" and "LoRAs", to name a few.
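
To make a few of those terms concrete, here is a minimal sketch of an image-to-image run using the open source Hugging Face diffusers library. The checkpoint, LoRA path, file names and parameter values are illustrative assumptions rather than a recommended recipe, but they show where "checkpoints", "seeds", "LoRAs" and denoising strength actually sit in a workflow.

```python
# Minimal img2img sketch with Hugging Face diffusers - illustrative only.
# The checkpoint, LoRA path and parameter values are assumptions, not a recipe.
import torch
from diffusers import AutoPipelineForImage2Image
from diffusers.utils import load_image

pipe = AutoPipelineForImage2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",   # the "checkpoint" (base model)
    torch_dtype=torch.float16,
).to("cuda")

# A "LoRA" is a small add-on trained to steer the base model towards a style
pipe.load_lora_weights("path/to/your-practice-style-lora")  # hypothetical path

init_image = load_image("massing_view.png")           # your own 3D view or sketch
generator = torch.Generator("cuda").manual_seed(42)   # the "seed" fixes randomness

result = pipe(
    prompt="brick warehouse refurbishment, overcast London sky, photographic",
    image=init_image,
    strength=0.35,        # how much denoising is applied over the input image
    guidance_scale=6.0,   # how strongly the prompt steers the output
    generator=generator,
).images[0]

result.save("test_001.png")
```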

Getting the best out of image models also requires a strong dose of curiosity, patience, and a willingness to keep persisting in the face of abject ugliness at times.

When we're working on this kind of thing, our overall hit rate is probably less than 5%. For a recent feasibility study presentation, using some very basic internal 3D renders to start things off, we produced six new images in total across three design studies – but when I look back at the results, I can see it took us about 211 separate tests to get there.

While the study produced these images very rapidly, we had to put up with a great deal of mediocre and downright appalling outputs to find something that captured what we were looking for.

A conversion rate of around 3% doesn't sound great, and it also feels inherently wasteful. When you're generating images, your GPU spins up like it's getting ready for take-off and uses a lot of electrical energy – an area I'm currently looking into in more detail, to better understand the actual energy impact of using GenAI at any kind of scale. Any practice looking to adopt GenAI will probably also need to establish a means of measuring its energy use intensity, and as model sizes increase in the coming years this will probably become an ever greater area of difficulty.


Text work in practice

AI image work really lights up the right side of my brain, but the AI technology that feels most important to me, and that I use more than any other – by far – is the Large Language Model (LLM).

My use of text models is growing over time; I'm now running around 10-20 tasks a day, across a very varied set of high-utility uses.

Early on I set GPT as my homepage and tried to use it for as much as I could, and even with this proactive attitude to adjusting my working methods, I still spent months forgetting it was there to help with all kinds of things during the day.

This behavioural-change hurdle may actually be the greatest barrier to meaningful change in the short term, because workflow muscle memory is strong and the cost of a failed attempt during daily working life is high.

The table below shows just a few examples of the type of things we can now attempt with a fair expectation of success. With the latest release of models, and the next wave just around the corner, the use cases will only grow in the coming months and years.


Arka Works

“Next-generation AI models are effectively embargoed until after the US election on 5th November, but expect to see significant gains in reasoning ability and intelligence when they arrive, with each of the main providers currently training / testing models more than an order of magnitude larger than the current largest & most-intelligent models.” Stephen Hunter, Buckminster AI

Text: what's hard?

There is no shortage of people expounding the virtues of LLMs, but let’s be honest about the difficulties.

"Multi-needle reasoning" problems. These happen when you ask the model to retrieve a number of separate facts at the same time and then to apply further layers of reasoning and logic on top. When you request too many needles and too many processes, you may get a few passable answers, but asking too much of the model at once can produce disappointing or incomplete results. There are ways around this that involve breaking the task down into smaller steps and giving the model only what it needs at each step – hence why you need to learn good prompt craft.
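
As a rough illustration of what breaking a task down can look like, here is a minimal sketch using the OpenAI Python SDK. The model name, file name and prompts are assumptions made for the example – the point is simply that each call retrieves one "needle" before any reasoning is layered on top.

```python
# Sketch of splitting a "multi-needle" request into sequential steps,
# using the OpenAI Python SDK. Model name, file and prompts are illustrative.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

def ask(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

brief = open("project_brief.txt").read()  # hypothetical input document

# Step 1: retrieve one "needle" at a time, rather than everything in one prompt
areas = ask(f"From the brief below, list only the stated floor areas.\n\n{brief}")
planning = ask(f"From the brief below, list only the planning constraints.\n\n{brief}")

# Step 2: only then ask for reasoning on top of the extracted facts
summary = ask(
    "Using these extracted facts, flag any conflicts between the floor areas "
    f"and the planning constraints.\n\nAreas:\n{areas}\n\nConstraints:\n{planning}"
)
print(summary)
```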

Accuracy. You still always have to spot-check the results and the validity of your workings. I call it spot-checking because, with the latest models, if the model is set off on the right track it will usually get things right. Hallucinations remain a challenge; although the rate of occurrence has dropped dramatically since I first started using these models nearly two years ago, the errors that remain are increasingly hard to identify. This is why spot-checks must remain part of your core approach for now.


Diagram demonstrating the context window size of popular models. The EU AI Act is ~116k tokens, or ~600,000 characters, or roughly 120,000 words in length, and can fit into large models with lots of room to spare (credit: Arka Works)

Long prompting is now better than Custom GPTs for most tasks. Now that large context-window models are available, we can work with hundreds of thousands of words of input. If you need very reliable performance, pasting your material into the model as one long-form prompt is now a more robust approach than a Custom GPT.

This is because Custom GPTs typically run on a smaller context window (32k) and also involve RAG (retrieval-augmented generation), a process that cuts your data into small chunks and retrieves only a selection of them at answer time, rather than passing your documents to the model in full. If the wrong chunks are retrieved, you will get a sub-par answer. Many practices have found uses for Custom GPTs, but it's now better to move towards a long-prompting method and make full use of these amazingly spacious context windows.
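
As a back-of-envelope illustration, the sketch below uses the tiktoken library to estimate whether a set of documents will fit into a model's context window before you paste them in as one long prompt. The window size, headroom figure and file names are assumptions – check the actual limits of whichever model you use.

```python
# Rough check of whether documents fit a model's context window before
# long-prompting them in full. The 128k window is an assumed figure for a
# large current model - check your provider's documentation.
import tiktoken

CONTEXT_WINDOW = 128_000      # assumed context window of the target model
RESPONSE_HEADROOM = 4_000     # leave room for the model's answer

enc = tiktoken.get_encoding("cl100k_base")

documents = ["design_and_access_statement.txt", "planning_policy_extract.txt"]
texts = [open(path, encoding="utf-8").read() for path in documents]

total_tokens = sum(len(enc.encode(t)) for t in texts)
print(f"Input is roughly {total_tokens:,} tokens")

if total_tokens + RESPONSE_HEADROOM < CONTEXT_WINDOW:
    # Safe to paste everything in long-form - no chunking or RAG needed
    long_prompt = "\n\n---\n\n".join(texts) + "\n\nSummarise the key planning risks."
else:
    print("Too long for one prompt - split the material or summarise in stages.")
```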

Data preparation. Most businesses are excited about the potential of AI, but they aren't excited about preparing all their reusable project and operational data so that it can be used again and again to produce first-draft written reports of many kinds. There is a surprising amount of written work in practice that could be massively aided by GenAI if this data were ready. To really feel the benefits, we need to start building asset libraries that are ready for LLMs to process and link together in new ways.
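
As a sketch of what "LLM-ready" preparation could look like, the snippet below flattens a hypothetical folder of past project write-ups into a single text file that can be dropped into a large context window. The folder layout and metadata fields are assumptions – adapt them to your own filing conventions.

```python
# Sketch of flattening reusable project data into one LLM-ready text file.
# The folder layout and metadata fields are assumptions about how a practice
# might organise its material - adapt to your own conventions.
import json
from pathlib import Path

library = Path("project_library")   # hypothetical folder of past project write-ups
out_lines = []

for project_dir in sorted(library.iterdir()):
    if not project_dir.is_dir():
        continue
    meta_file = project_dir / "metadata.json"
    meta = json.loads(meta_file.read_text()) if meta_file.exists() else {}
    out_lines.append(f"## {meta.get('name', project_dir.name)}")
    out_lines.append(f"Sector: {meta.get('sector', 'unknown')} | "
                     f"Stage: {meta.get('stage', 'unknown')}")
    for doc in sorted(project_dir.glob("*.txt")):
        out_lines.append(f"### {doc.stem}")
        out_lines.append(doc.read_text(encoding="utf-8").strip())

Path("asset_library_for_llm.txt").write_text("\n\n".join(out_lines), encoding="utf-8")
```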


Now and Then: a popular example of a hallucination from 2022. When you ask the same question today you get excellent performance and data – however, the final sentence still suggests that Thomas Andrews was both a notable survivor and someone who perished (credit: Arka Works)

Conclusion – team-led AI

The most effective AI strategies in professional practice that I've seen aren't mandated from on high, but rather emerge organically from teams who are given the freedom to experiment and the trust to exercise their individual judgement about what to do with the results.

This team-led approach to AI adoption, where workers take responsibility for how they wish to use it, allows for a more nuanced integration that respects existing workflows while uncovering new efficiencies and enthusiasm from within the team.

Consider building a “heat map” of opportunities within your organisation. Where are the places that these ideas really fit without trying too hard? Once you’ve identified these hotspots, tackle them one at a time. Small, incremental changes often lead to the most sustainable transformations.

AI is neither a panacea nor a parlour trick – it’s a curious instrument that, when wielded with skill and discernment, can help us work smarter, faster, and perhaps even with greater enjoyment.

We are still so early. So, I urge you to withhold judgement; be experimental and be curious.


About the author

Keir is an architect operating with one foot in architectural practice and one in the development of Generative Design and AI tools and workflows. He founded the creative consultancy Arka Works in 2023, following a directorship at AJ100 practice Morris+Company, with a mission to prepare the profession for AI-driven change. He does this by helping architects, clients and startups to effectively apply the latest Generative Design and AI tools to the work they already do in practice, so that they can adapt to a rapidly changing professional landscape.
