Qwen3 235B-A22B: Deep Local Testing – Biggest Qwen Yet!


Exploring the Qwen3 235B-A22B Model: A Comprehensive Review

Unveiling the Qwen3 235B-A22B

In the rapidly evolving realm of AI, the release of the Qwen3 235B-A22B model has captured significant attention. Regarded as the flagship of the Qwen3 lineup, its launch sparked excitement among tech enthusiasts, including myself. In this article, we'll delve into the model's unique features, performance metrics, and overall capabilities.

Setting Up the Adventure

After some initial hiccups with my camera, I was eager to test the Qwen3 235B-A22B on my high-powered setup featuring dual 3090 Ti GPUs and a robust Intel i7 processor with 128GB of RAM. This model claims to rival elite contenders like DeepSeek R1 and Gemini 2.5 Pro, a compelling assertion that set high expectations.

The initial setup was challenging, as finding the right settings took considerable time and effort. Ultimately, I settled on a context length of 4,096 tokens, which seemed to deliver satisfactory results. Despite a slow processing rate of around 2.5 to 3 tokens per second, I was thrilled to run it locally, enabling a hands-on experience.
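To put those numbers in perspective, here is a quick back-of-envelope timing sketch in Python (the token rates come from the run described above; the 500-token response length is just a hypothetical example):

```python
# Rough time estimates at the observed 2.5-3 tok/s generation rate.
def gen_minutes(tokens: int, toks_per_sec: float) -> float:
    """Minutes to generate `tokens` at a given tokens-per-second rate."""
    return tokens / toks_per_sec / 60

# A 500-token answer at the slow end of the observed range:
print(round(gen_minutes(500, 2.5), 1))   # -> 3.3 minutes
# Filling the entire 4,096-token context at the fast end:
print(round(gen_minutes(4096, 3.0), 1))  # -> 22.8 minutes
```

At these speeds a single long answer can take tens of minutes, which is why running the model locally here is more of a proof of concept than a daily driver.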

The HTML Test

I began with a straightforward HTML website test, instructing the model to generate a website for “Steve’s PC Repair.” Upon completion, I was pleasantly surprised by its ability to create functional, albeit basic, designs. It featured hover effects and incorporated credentials like "Certified CompTIA A+"—a curious decision, but indicative of the model’s capability to produce relevant, if not entirely authentic, content.

The website included practical features such as a contact form, a footer with the year 2023, and a list of services, though the overall design felt merely adequate, lacking in pizzazz. I had anticipated a more visually stunning output, which left me wanting more.

Transition to Retro Gaming

Shifting my focus, I decided to explore its coding capabilities by generating a retro game titled "Neon Drift." Although the game functioned without major errors, its simplistic design and color choices left much to be desired. I appreciated the straightforward execution but was left contemplating its potential.

In my quest for improvement, I decided to pivot toward another game type: a Pong clone. After reworking the JavaScript files, the game produced better results, albeit with some remnants of the previous project. The excitement of seeing tangible outputs kept me engaged, revealing both the limitations and strengths of the Qwen3 235B-A22B in code generation.

Crafting a Stunning Website

Not one to settle for mediocrity, I endeavored to push the boundaries by requesting an aesthetically pleasing website for Steve’s PC Repair that could rival Fortune 500 entities. The model responded with a more intricate design, showcasing additional features and efforts, although it still lacked the full visual sophistication I yearned for.

Curiously, the model got the footer year right this time (2025), a small but telling quirk that recurred across many of my interactions with it. The variation between its "thinking" and "non-thinking" modes seems to cause fluctuations in accuracy and creativity.

The VC Pitch: A Creative Challenge

Perhaps the most intriguing endeavor was asking the model to generate a pitch for a fictional product called "Notify." The model excelled in creating a professional-looking presentation, right down to the marketing jargon that would typically lure investors in.

However, the quirkiness of its responses raised eyebrows—like suggesting mock data for traction or packages for download. While amusing, these inaccuracies highlighted the model’s growing pains.

Conclusion: A Mixed Bag of Potential

The Qwen3 235B-A22B certainly demonstrated promise but also showcased several shortcomings that kept it from truly impressing. Overall, it performed satisfactorily for website generation and coding exercises, yet it struggled to match the sleekness and functionality of leading models.

In summary, while I enjoyed my testing experience, the Qwen3 235B-A22B felt like a stepping stone. It shows how the landscape of AI continues to evolve and raises questions about future iterations. For those interested in diving deeper into AI coding and web generation, the Qwen3 235B-A22B may not be perfect, but it serves as a notable entry point. Your thoughts and questions are welcome as we continue exploring the frontiers of artificial intelligence together.




16 COMMENTS

  1. Aside from the meme-worthy thinking time (100 minutes for a snake game), the quality is excellent. The code quality, as I emphasized, was superb: methodical, clean code that is better than most software developers produce. The only bug I found is that although the reset method was implemented correctly, no key was bound to it.

    The GPU offloading doesn't seem to be working properly, as RAM bandwidth is still the bottleneck (and it's VERY consistent). This will absolutely fly when those 22 billion parameters are located in GPU VRAM.

    Plus, it [should] be possible to run models that don't fit into memory, loading only the experts into memory as they're needed. So both richer and faster.

    Oh, and something that doesn't seem to be implemented or documented very well: Qwen3 can also sideload a smaller model for 'draft' calculations.
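The RAM-bandwidth argument in this comment can be sanity-checked with a rough calculation. For a bandwidth-bound MoE model, each generated token must stream the active parameters from memory once; the bandwidth and quantization figures below are illustrative assumptions, not measurements from the video:

```python
# Back-of-envelope decode-speed ceiling for a memory-bandwidth-bound MoE model.
def max_tok_per_sec(active_params_b: float, bytes_per_param: float,
                    bandwidth_gbs: float) -> float:
    """Upper bound on tokens/sec when decoding is bandwidth-bound."""
    bytes_per_token = active_params_b * 1e9 * bytes_per_param
    return bandwidth_gbs * 1e9 / bytes_per_token

ACTIVE_B = 22    # Qwen3-235B-A22B activates ~22B parameters per token
Q4_BYTES = 0.5   # assume ~4-bit quantization (~0.5 bytes/param)

print(round(max_tok_per_sec(ACTIVE_B, Q4_BYTES, 50), 1))   # dual-channel DDR4, ~50 GB/s -> 4.5
print(round(max_tok_per_sec(ACTIVE_B, Q4_BYTES, 936), 1))  # RTX 3090 VRAM, ~936 GB/s -> 85.1
```

Roughly 4-5 tok/s streaming from system RAM versus ~85 tok/s from VRAM, consistent with the commenter's expectation that the model will "fly" once the active experts sit entirely in GPU memory.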

  2. Deepseek 3.1 is still better for me. See this response from both (Deepseek's is clearly better):

    Q: Give me expert tank tips for mlbb

    Qwen3-235B:

    Perfect Your Rotation: Smart tank rotations and map control are essential to turn the game around. Focus on supporting your team by rotating effectively between lanes and assisting in securing objectives.

    Prioritize Teamfight Engagement: As a tank, your role is to initiate fights or disrupt enemy formations. Use your crowd control abilities wisely to create opportunities for your team to secure kills or push advantages.

    Itemization Matters: Choose items based on the enemy team's composition. Build magic resistance if they have heavy magical damage, or armor if facing physical threats. Flexibility in item builds will make you a more effective frontline.

    Map Awareness: Always keep an eye on the minimap to track enemy movements. This helps you warn your team of potential ganks and decide when to roam or assist in securing buffs, dragons, or other key objectives.

    Positioning and Timing: Avoid overextending and know when to disengage. Proper positioning allows you to absorb damage while protecting your backline carries. Timing your abilities can be the difference between winning and losing a team fight.

    Master Your Hero: Learn the strengths and weaknesses of your chosen tank. Each tank has unique abilities, so mastering your hero's kit will help you maximize impact in-game.

    Communication and Teamwork: Coordinate with your team to ensure cohesive plays. Good communication can lead to better objective control and more efficient teamfights.

    ——————————–

    Deepseek 3.1:

    1. Role Understanding

    Peel for your carries – Protect your damage dealers (MM, Mage) instead of over-chasing kills.

    Initiate wisely – Don’t engage without backup unless you’re sure your team can follow up.

    Zone enemies – Control key areas (bushes, objectives) to deny enemy vision and positioning.

    2. Map Awareness & Rotation

    Roam efficiently – Always check the minimap and rotate to lanes that need help.

    Secure vision – Use bushes to scout enemies and alert teammates of ganks.

    Prioritize objectives – Help secure Turtle, Lord, and Towers instead of pointless fights.

    3. Itemization Tips

    Adapt your build – Adjust items based on enemy composition:

    Magic Damage? → Athena’s Shield, Radiant Armor

    Physical Burst? → Blade Armor, Antique Cuirass

    Heavy CC? → Tough Boots, Oracle

    Team utility items – Dominance Ice (anti-heal), Immortality (late-game), Guardian Helmet (sustain).

    4. Crowd Control (CC) Timing

    Chain CC – Coordinate stuns/slows with your team to lock down key enemies.

    Save key skills – Don’t waste your main CC (e.g., Franco’s hook, Tigreal’s Flicker+Ult) randomly.

    Disrupt enemy combos – Use knock-ups or stuns to cancel channeling ults (e.g., Pharsa, Odette).

    5. Advanced Techniques

    Flicker + Ult plays – Works well with Tigreal, Atlas, Minotaur for surprise engages.

    Bodyblocking – Block skillshots (e.g., Franco’s hook, Selena’s arrow) for your carries.

    Baiting enemies – Pretend to be out of position to lure enemies into a trap.

    6. Communication & Mind Games

    Ping wisely – Use “Attack!” or “Retreat!” to guide teammates.

    Psychological pressure – Show presence in bushes to scare enemies away from objectives.

    Track enemy spells – Keep an eye on enemy Flicker/Purify cooldowns before engaging.

    7. Late-Game Priorities

    Stick with the carry – In late-game, one mistake can lose the match, so prioritize protecting your damage dealers.

    Sacrifice if needed – If you can save your core by dying, do it.

    Best Tanks for Different Situations

    Early Game Pressure: Hylos, Grock

    Teamfight Engage: Atlas, Tigreal, Minotaur

    Anti-Dive/Peel: Lolita, Khufra

    Roam Dominance: Franco, Kaja

    ——————————————-

  3. Got this running on an M4 Max 128GB, but only in Q2_K; it was OK. 18+ tokens/sec. These MoE/sparse models are great for Apple Silicon: big VRAM, slow GPU. But I think I am better off running the smaller MoE, the 30B-A3B model, in MLX; not sure how Q2 quality is for 235B-A22B. About to build a dual 3090 as well, but I think I will use that for dense models or image/video AI workloads.
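Whether a given quant fits in 128GB of unified memory is easy to estimate. A rough sketch (the bits-per-weight figures are typical ballparks for llama.cpp-style quants, not exact values):

```python
# Rough in-memory size of a quantized model's weights.
def model_size_gb(params_b: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB for params_b billion parameters."""
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

print(round(model_size_gb(235, 2.6), 1))  # 235B at Q2_K (~2.6 bpw) -> 76.4 GB
print(round(model_size_gb(235, 4.5), 1))  # 235B at Q4_K_M (~4.5 bpw) -> 132.2 GB
print(round(model_size_gb(30, 4.5), 1))   # 30B-A3B at Q4_K_M -> 16.9 GB
```

Around 76 GB at Q2_K fits a 128GB M4 Max with room for the KV cache, while ~132 GB at Q4_K_M does not, which is why the 235B model is stuck at Q2 on that machine and the 30B-A3B is comfortable.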

  4. The way I see things with the Steve's PC Repair test is that it's best for the model to make as few assumptions as possible, because if you're using a model to make a website, you want it to tailor that website to you. While it's fantastic that GLM, for example, can make a great-looking site with no style instructions, it's also important that a model can follow the style guidelines you give it when developing a site in real-world use cases. I think its ability to follow style guidelines and instructions is somewhat more important than its ability to generate something that looks great (rather than simplistic) with no style instructions. Great video though, and I appreciate the variety of tests you do!

  5. I got it running with a single P40, much less powerful than your graphics cards, and it's giving me around 4.5 tok/sec, so I think you should get something better than that. I am running with a 128K context. The differences I can see from your setup: the K and V cache quant type set to Q8_0, 18 layers offloaded, CPU thread pool size set to 8 (in both load and inference), and a 5950X as the CPU, not overclocked. I am also running an Ubuntu derivative, but before launching LM Studio I set this env variable:
    export GGML_CUDA_ENABLE_UNIFIED_MEMORY=1 (you may need Resizable BAR for it, not sure).
    I don't know if it might be of any help; 4.5 tok/sec is still unusable.
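The Q8_0 KV-cache setting mentioned in this comment matters a lot at 128K context. A rough size estimate (the layer and head figures below are the published Qwen3-235B-A22B architecture as I understand it, and Q8_0 is approximated as 1 byte/value, so treat these as assumptions):

```python
# Approximate KV-cache size for a long context with grouped-query attention.
def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                ctx: int, bytes_per_val: float) -> float:
    """GB needed to store keys and values for ctx tokens."""
    return 2 * layers * kv_heads * head_dim * ctx * bytes_per_val / 1e9

# Assumed Qwen3-235B-A22B shape: 94 layers, 4 KV heads, head_dim 128.
print(round(kv_cache_gb(94, 4, 128, 131072, 2.0), 1))  # FP16 cache -> 25.2 GB
print(round(kv_cache_gb(94, 4, 128, 131072, 1.0), 1))  # ~Q8_0 cache -> 12.6 GB
```

Roughly 25 GB at FP16 versus ~12.6 GB at Q8_0, which is why quantizing the cache (plus unified-memory spillover) makes a 128K context workable alongside a 24 GB P40.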

  6. Great work testing Qwen3 immediately after release! I thoroughly enjoyed the videos.

    I actually prefer the 32B model over the 235B-A22B model. It seems like having more active parameters (32B dense versus 22B active) improves the amount of knowledge baked into the weights.