
How Would You Improve SQL? 271

theodp asks: "It was the best of languages, it was the worst of languages. SQL's handy, but it can also drive you nuts. For example, if you want all 100 columns from a table, 'SELECT *' works quite nicely. However, if you want all but 1 of the 100 columns, be prepared to spell out 99 column names. Wouldn't it make sense to provide a Google-like shorthand notation like 'SELECT * -ColumnName' (or a DROP= syntax like SAS's)? So how would you improve SQL?"
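A hedged illustration of the workaround most people reach for today: build the column list programmatically from the catalog instead of typing the 99 names. The table and column names below are invented, and the sketch uses SQLite's PRAGMA table_info from Python's sqlite3 module; other engines expose the same information through information_schema.columns.

```python
import sqlite3

# Build "SELECT * minus one column" by reading column names from the
# catalog. Table "wide" and the excluded column "b" are invented examples.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wide (a INTEGER, b INTEGER, c INTEGER)")
conn.execute("INSERT INTO wide VALUES (1, 2, 3)")

exclude = {"b"}
cols = [row[1] for row in conn.execute("PRAGMA table_info(wide)")
        if row[1] not in exclude]          # row[1] is the column name
query = "SELECT %s FROM wide" % ", ".join(cols)
print(query)                               # SELECT a, c FROM wide
print(conn.execute(query).fetchone())      # (1, 3)
```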
This discussion has been archived. No new comments can be posted.

  • Indexes (Score:2, Interesting)

    Right now there is no standard way of maintaining indexes. Most databases have some sort of CREATE INDEX query, but it is by no means standardized.
  • Hierarchical queries (Score:5, Interesting)

    by Bogtha ( 906264 ) on Thursday November 03, 2005 @06:38PM (#13945641)

    Extremely useful when you need to produce a result tree instead of a result list (e.g. Slashdot's nested comments). Oracle does this with "CONNECT BY", there is also a PostgreSQL patch [moonbone.ru] available. Of course there are hacks to do this, such as adding extra fields to keep track of where you are in the tree, but they are a real pain in the arse compared with using the information that's already present in the database.
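For readers landing here later: standard SQL did eventually address this with recursive common table expressions (WITH RECURSIVE, from SQL:1999), now widely implemented. A minimal sketch of a result tree over an invented comment table, runnable against SQLite from Python:

```python
import sqlite3

# WITH RECURSIVE walks the parent/child links without any extra
# tree-tracking fields. The "comment" schema here is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE comment (id INTEGER PRIMARY KEY, parent_id INTEGER, body TEXT);
INSERT INTO comment VALUES (1, NULL, 'root'),
                           (2, 1, 'reply'),
                           (3, 2, 'reply to reply');
""")
rows = conn.execute("""
WITH RECURSIVE thread(id, body, depth) AS (
    SELECT id, body, 0 FROM comment WHERE parent_id IS NULL
    UNION ALL
    SELECT c.id, c.body, t.depth + 1
    FROM comment c JOIN thread t ON c.parent_id = t.id
)
SELECT body, depth FROM thread ORDER BY depth
""").fetchall()
print(rows)  # [('root', 0), ('reply', 1), ('reply to reply', 2)]
```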

  • No poetry (Score:4, Funny)

    by Fred_A ( 10934 ) <fred@NOspam.fredshome.org> on Thursday November 03, 2005 @06:38PM (#13945644) Homepage
    You can't write poetry in SQL. So it remains an inferior language compared to Perl.

    (yes, well, I ran out of ideas)
  • Check out LINQ... (Score:3, Informative)

    by 0kComputer ( 872064 ) on Thursday November 03, 2005 @06:38PM (#13945647)
    If you want to get an idea of some cool SQL improvements, check out the
    http://msdn.microsoft.com/netframework/future/linq / [microsoft.com]
    LINQ (Language Integrated Query) project for C# 3.0. Some cool stuff that I never really thought about.

        For example, their select statements go backwards, i.e. from table, select column1, column2, column3, etc. Seems kinda wacky at first, but it makes sense since you really should know what table you're selecting from before you specify the columns.

    ex.

    public void Linq3() {
            List<Product> products = GetProductList();

            var expensiveInStockProducts =
                    from p in products
                    where p.UnitsInStock > 0 && p.UnitPrice > 3.00M
                    select p;

            Console.WriteLine("In-stock products that cost more than 3.00:");
            foreach (var product in expensiveInStockProducts) {
                    Console.WriteLine("{0} is in stock and costs more than 3.00.", product.ProductName);
            }
    }
  • by joto ( 134244 ) on Thursday November 03, 2005 @06:39PM (#13945653)
    For example, if you want all 100 columns from a table, 'SELECT *' works quite nicely. However, if you want all but 1 of the 100 columns, be prepared to spell out 99 column names.

    If this is your main problem with SQL, then you have other problems as well. Who in their right mind needs a table with 100 columns? If you have 100 columns, you seriously need to normalize your database.

    Ok, I might not be a database buff. Actually, my experience with SQL is purely academic (although I've worked with object-oriented databases). But if I were to improve SQL, my attempts would be in the direction of making it into a more pure mapping of a relational database, not in adding yet more syntactic sugar.

    • "If this is your main problem with SQL, then you have other problems as well. Who in their right mind needs a table with 100 columns? If you have 100 columns, you seriously need to normalize your database."

      You're kidding, right? I can think of a lot of reasons to have 100 columns, it simply depends on what you're working with - and it is perfectly normalized. Say you had a questionnaire on a web form. Let's say it had 100 questions. Let's say you had a checkbox next to each. When the user checked the
      • by Otter ( 3800 )
        I don't work with databases, just with enormous flat text and SAS files spit out by them, but -- isn't the proper way to do that with one table with user, question# (1-100), and response (0,1) and another with question# and question?

        Your method may be technically correct (I could never get those normal things straight) but as the GP points out, it's an unwieldy, inflexible way to do it.

        • You could easily do it the way you propose - the only thing is, you now have 100 rows where before you had 1 - which I'm not saying is the wrong way... but either way you're going to have 100 columns or 100 rows. Performance wise, I'm not sure which is better or if it matters.
          • by boxxa ( 925862 )
            I think it's better performance-wise to have more rows than columns. I develop a lot of PHP/MySQL applications, commercially and as OSS, and I find that using multiple small tables with more rows seems to load and run smoother than tables with a large number of columns. Maybe something with the indexing, but I have no actual proof of what works better. On the original topic, I believe that MySQL has its downfalls compared to other database systems, but for me personally, I find that I can work around the problems
          • As someone else has already explained, your approach is using the database as a giant Excel table, not as a database. You could do exactly the same thing with a text file and get your performance advantage that way.
        • One day there should be a race to see if the best group of programmers and mathematicians can come up with a way to query off a text file faster than the fastest relational DB.

      • by Johnno74 ( 252399 ) on Thursday November 03, 2005 @07:10PM (#13945952)
        I can think of a lot of reasons to have 100 columns, it simply depends on what you're working with - and it is perfectly normalized

        I've got mod points but I just have to give them up to reply to this....

        You're wrong. I agree totally with the grandparent post.
        You DON'T need 100 columns, ever. If you have that many columns you should be breaking the table up into several tables with 1:1 joins. Seriously. There will always be some fields that aren't needed. Make the rows smaller by putting commonly used fields in one table, rarely used fields in the other.

        And your example of a questionnaire (one row per respondent, one column per question) is not even close to normalised. What happens if there is a new question? You have to alter the schema. What happens if some questions are not answered? You'd have to have nulls, and wasted space.

        A much better structure is to have a table of questions, and a table of responses (with something like a response id, and maybe an identifier on who answered the questionnaire) and a question-answer table with each row pointing at a response and a question, and giving the answer that person gave for that question.
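The structure described above can be sketched concretely. All table and column names are illustrative, using SQLite from Python (note SQLite only enforces the REFERENCES clauses with PRAGMA foreign_keys = ON):

```python
import sqlite3

# Normalized questionnaire: questions, responses (submissions), and an
# answer table linking the two. Everything here is an invented example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE question (id INTEGER PRIMARY KEY, text TEXT);
CREATE TABLE response (id INTEGER PRIMARY KEY, respondent TEXT);
CREATE TABLE answer (
    response_id INTEGER REFERENCES response(id),
    question_id INTEGER REFERENCES question(id),
    checked INTEGER,
    PRIMARY KEY (response_id, question_id)
);
INSERT INTO question VALUES (1, 'Q1'), (2, 'Q2');
INSERT INTO response VALUES (1, 'alice');
INSERT INTO answer VALUES (1, 1, 1), (1, 2, 0);
""")
# Adding question #101 is now an INSERT, not a schema change:
conn.execute("INSERT INTO question VALUES (3, 'Q3')")
n = conn.execute("SELECT COUNT(*) FROM question").fetchone()[0]
print(n)  # 3
```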
        • "You DON'T need 100 columns, ever. If you have that many columns you should be breaking the table up into several tables with 1:1 joins. Seriously. There will always be some fields that aren't needed. Make the rows smaller by putting commonly used fields in one table, rarely used fields in the other."

          I completely disagree. Breaking up the questionnaire into separate tables does nothing. I want every question every time, it's all one HTML page with checkboxes and links to the questions. If they were seper
          • by rho ( 6063 ) on Thursday November 03, 2005 @07:33PM (#13946142) Journal
            If they were separate tables, I'd just have a lot of joins every time the page loads, which isn't going to help performance.

            That is not a SQL or database issue, it's an issue with your scripting language and Web server. This would be cached.

            You certainly can keep the table at 100 columns. But at that point, you might ask yourself why you're using a database at all. A flatfile will probably have less overhead, even with file locking issues. Especially considering the simplistic questionnaire you're using as an example--a long line of 0s and 1s would do you.

          • If you put 0s in unanswered fields, then you'll have an awful lot of zeros, and since you're using bits, that's a lot of 'no' answers, which isn't what's really there.

            Any DBA worth his salt knows that if you ever see a single table with 100 columns, you have a major design issue.

            Splitting that up into several tables in the same database will offer a significant performance increase.
          • by sheldon ( 2322 )
            Breaking up the questionnaire into separate tables does nothing.

            Well, from my experience, breaking up the questionnaire into separate tables does accomplish making maintenance of the app simpler. Which, depending on whether you want to be stuck supporting the same dumb app for the rest of your life, is important.

            Now you auto-generate your questionnaire based on a Questions table... and as an added bonus, when someone decides they really need to ask Question #101, they can input it through a mainten
          • No wasted space and every column is relevant in my example. EVERY time I want EVERY column, no exceptions.
            Then your example is contrived. In the real world somebody would ask, "How many people checked item 100?" or "How many people who checked item 33 also checked item 99?"
        • That it's possible to work around the problem doesn't mean that the problem is invalid.

          Yes, you can come up with a way that you can organize data and tables so that this problem is easier to solve, but there's certainly a way that the data and the language can be made so that you don't have to fiddle with that sort of crap.

          And that said, someone may have an instance where, for whatever reason, the data absolutely should be laid out in a table that large. They shouldn't be punished for using the system in th
        • A much better structure is to have a table of questions, and a table of responses (with something like a response id, and maybe an identifier on who answered the questionare) and a question-answer table with each row pointing at a response and a question, and giving the answer that person gave for that question.
          An answer has to point to a specific question and makes no sense with just an answerid and a userid, so you should have the questionid in the answers table; it's nonsense to associate them in a
      • by HawkingMattress ( 588824 ) on Thursday November 03, 2005 @09:31PM (#13946926)
        The schema you're talking about is wrong, wrong, wrong. You've made a schema based on how you want to display the information, not based on what this information is carrying.
        How can you write a query to find which users responded to more than 3 questions, for example? You'd have to test each column individually, and then you'd have to modify the query if you add or remove some columns. Furthermore, you have to change the schema itself each time you add or remove a question.
        And finally, you don't use the relational part of the database at all. Your schema supposes that there is a question table, with a questionid. And in your answers table, you have columns named after the ids, right? Now, if you delete a question, how can the database check that there are no answers pointing to it? It can't, because you're using the database as if it were a flat file. So the database can't check data integrity for you, because your schema isn't normalized at all. As Johnno74 says, you need at least a question table, a response table, and a user/question/response association table. If you don't do that, the database has zero advantages over a flat file.
        • Yes, my example is wrong for what *you* think I should do with it, but NOT wrong for what I use it for. Again, I don't need the database to check the integrity of anything for me. I won't be adding columns. It's as simple as that. Back to the original point - when somebody says a table with 100 columns is done wrong, they aren't factoring in every possible use for one. Am I using this specific table like a flat file? Yes. There is no rule that says I can't. It doesn't mean the schema doesn't work for th
    • Why do I need 99/100 columns? Perhaps a poor example, but the obvious answer is: Because some nitwit built this database and I just have to live with it.

      Do you think most SQL queries are executed by people who created the database in the first place, or even have any control over how it is designed? Don't be silly. It's the usual problem with software. Computers are so flexible, and commercial software is so inflexible, that a huge amount of energy is expended trying to get data from point A to point B when
  • ugh... (Score:5, Insightful)

    by Anonymous Coward on Thursday November 03, 2005 @06:39PM (#13945657)
    Did you know there have been people working on a general algebra for data management for, what 40 years now? Did you know, this is basically a SOLVED PROBLEM? Ever heard of "D"? Or Tutorial D? The Third Manifesto?

    You know, I totally understand why Fabian Pascal [dbdebunk.com] is always pissed off.

    Here's something for you to chew on:

    Why do programmers write this:

    A + 3

    when they want to add 3 to A? Why do we not write some lovely crap like:

    OPERATE ON A WITH 3 USING ADDITION

    why do we write:

    (A + 3) * 2

    and not:

    OPERATE ON (OPERATE ON A WITH 3 USING ADDITION) WITH 2 USING MULTIPLICATION

    Why do we do that?? Because algebraic notation is 1) declarative .. it tells the computer what you want, it doesn't tell it how to do it and in what order, and 2) algebraic notation is *completely general*. You can nest arbitrarily with parentheses, and you can clearly see what's a variable and what's a value and what's an operator. Easy to create, understand, and *optimize*.

    Do you agree with me that the verbose syntax clouds your thinking? Keeps you from seeing the underlying operations? Makes it difficult to apply the basic algebraic skills you learned in high school? Makes it difficult for the compiler writer to do the same?

    Now I ask you, why do we write:

    SELECT * FROM Order

    and not

    Order

    Why do we write:

    SELECT * FROM Order JOIN OrderItem WHERE Order.order_id = OrderItem.order_id

    and not

    Order JOIN OrderItem

    And here you are, dwelling on some little detail about projecting columns.. this is an easy one: use an "ALL BUT" operator for example:

    RelvarWith100Attributes ALL BUT (Attribute100)

    Once you see that relational algebra is just values, variables, and operators nested in any arbitrary way, just like arithmetic, you have opened the door a little more to understanding the fundamental theory of data management and how backwards and primitive "modern" data management is.

    And let's not even get into all the crap that SQL gives us like duplicate rows, NULLs, brain-dead table-oriented storage, lack of 100% updateable views, lack of arbitrary constraints, (often) lack of composite types (why the hell do we splat objects into MULTIPLE COLUMNS?? They should be stored in ONE column). SQL also confuses logical and physical layers (keys vs. indexes), and has basically kept the database industry in the dark ages for decades now.

    So the answer to your question is pretty simple: I would ditch SQL and use something that looks like relational algebra, which has been understood and documented for probably longer than you've been *alive*. No offense.
    • I don't disagree with a thing you say. However, I have to ask you - what's more likely? That the entire world ditches SQL and moves to a better system? Or that a new, updated version of SQL includes fixes for some of its most heinous sins? I think the answer clearly is the latter, which leads back to the question - what should we do to fix SQL?
    • Re:ugh... (Score:5, Insightful)

      by rho ( 6063 ) on Thursday November 03, 2005 @08:01PM (#13946365) Journal
      SELECT * FROM Order JOIN OrderItem WHERE Order.order_id = OrderItem.order_id and not Order JOIN OrderItem

      Why do you assume Order and OrderItem will be joined on order_id? They don't have to, you know. So you have a "Order JOIN OrderItem ON order_id" format. Except, why do you assume both columns will have the same name? So you now have a "Order JOIN OrderItem ON order_id AND order_id_submitted" format.

      And before long, you've got an equally baroque language for describing a query as SQL is now. SQL's point is to have something reasonably human-readable. It's amazingly flexible, and easy for beginners to pick up with simple queries. Your New and Improved query language had better be leaps and bounds ahead, not just simpler to type.

      (Especially since probably 90% of queries are written only once, and stored for future use in a script or as a stored procedure. Building a query can be arduous for complex data, but it doesn't have to be typed every time you use it.)

      • I found SQL very hard as a beginner. I learned "connect to mydatabase user oyenstikker using lousypassword". Fine. It sounds like English. So I remembered the thoughts and not the syntax. The 2nd, 3rd, 4th, 7th, 12th, and 18th time, I typed "open mydatabase user oyenstikker password lousypassword", "connect mydatabase as oyenstikker with lousypassword", or something like that. If the language is an abstraction, it is easier to remember the syntax.
      • The relation between tables should be encoded in the database metadata via foreign key relationships, so in the simple and most common case you do not need to specify column names for joins.
        • by rho ( 6063 )
          Agreed, but you must still take into account queries that have little or nothing to do with the foreign keys. SQL is flexible in that way--SQL's replacement must be flexible in the same way, and at that point, what benefit is SQL's replacement bringing?
    • by RingDev ( 879105 ) on Thursday November 03, 2005 @08:23PM (#13946509) Homepage Journal
      Let's look at something a little more realistic:

      SELECT
        Lease.LeaseNum,
        Lease.LesseeNum,
        Invoice.InvoiceNum,
        Invoice.AmountBilled
      FROM
        Lease INNER JOIN
        Invoice ON
          Lease.LeaseNum = Invoice.LeaseNum
      WHERE
        Lease.LeaseNum = "1234"
      ORDER BY
        LeaseNum, InvoiceNum

      Okay, that's pretty big to get some basic lease and invoice info. Now how would you write that?

      Lease.LeaseNum,
      Lease.LesseeNum,
      Invoice.InvoiceNum
      Invoice.AmountBilled
      Lease JOIN
      Invoice ON
        Lease.LeaseNum AND Invoice.LeaseNum
      Lease.LeaseNum = "1234"
      Lease.LeaseNum
      Invoice.InvoiceNum

      ??? All that's been accomplished is the removal of keywords. I'm not seeing any benefit, and I'm seeing the pitfall of it being hard as hell to read.

      -Rick
      • by Anonymous Coward
        You're missing the point. You really have to study the theory, and you'd get something like this (the exact syntax is unimportant of course):

        ((Lease JOIN Invoice) WHERE LeaseNum = "1234")
        [LeaseNum, LesseeNum, InvoiceNum, AmountBilled]
        ORDER BY whatever

        Why do I put the column names at the end? Because a projection operation applies to a *single relational result*, not to individual tables.

        Why do I not qualify every column name? Because relational attributes *must* always be unique and unambiguous.

        Why do I lea
        • Why do I not qualify every column name? Because relational attributes *must* always be unique and unambiguous.

          Result, C naming syndrome.

          LeaseNumber and Lease.Number are functionally equivalent, they're both unique, however the latter lends itself to convenient simplification in contexts where it wouldn't create ambiguity. Otherwise you end up repeating yourself a lot. LongTableNamePrefixFoo, LongTableNamePrefixBar, LongTableNamePrefixBaz etc.
    • Re:ugh... (Score:3, Funny)

      by lscoughlin ( 71054 )
      That's cute.

      Now start inserting and updating.

      maybe joining.

      Yeah -- SQL is stupid, sure. But your proposal here really isn't any brighter.
  • I'd *love* it if "group by *" (or some such syntax) would just group by all the items in the select in order... It's so bloody tedious to have to re-write each item being selected for the group clause.
    • you are asking for duplicate elimination as in relational algebra.

      easily achieved in SQL by using SELECT DISTINCT instead of SELECT.

      I guess what you really mean is "group by everything which is not explicitly in some form of aggregation in the select statement", which is a more specialized case, ironically one that fits exactly into the "* except x" pattern of the original poster.
      • Only, a "select distinct" won't give me the "count(*)" (which I want).

        select fielda, fieldb, fieldc, fieldd, count(*) from foo where status=1 group by fielda, fieldb, fieldc, fieldd;

        Just a personal gripe I know, but it seems redundant to tell it what to group by when 99% of the time it's the same order as the fields are selected by...
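The gripe above can be made concrete: every non-aggregated column in the SELECT must be restated in the GROUP BY. A runnable illustration with an invented table, using SQLite from Python:

```python
import sqlite3

# Standard SQL makes you repeat the selected (non-aggregate) columns
# in GROUP BY. Table "foo" and its data are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (fielda TEXT, fieldb TEXT, status INTEGER)")
conn.executemany("INSERT INTO foo VALUES (?, ?, ?)",
                 [("x", "y", 1), ("x", "y", 1), ("x", "z", 1)])
rows = conn.execute("""
    SELECT fielda, fieldb, COUNT(*) FROM foo
    WHERE status = 1
    GROUP BY fielda, fieldb   -- must restate the selected columns
    ORDER BY fieldb
""").fetchall()
print(rows)  # [('x', 'y', 2), ('x', 'z', 1)]
```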
  • Using stored procedures or triggers in a database is a sure way to tie yourself to a particular database platform. For applications that have to be cross-platform, I usually try very hard to avoid triggers and stored procedures.

    It would be great if there were a common language for stored procedures. An RDBMS might support many languages, but that one scripting language would be available on most platforms, since it was part of the SQL standard.

    Another feature missing is optimization hints. Oracle uses special S

    • There is a common language for stored procedures / triggers. SQL Persistent Stored Modules (SQL/PSM) is defined in ISO/IEC 9075-4:2003 [iso.org] and is currently implemented by at least IBM DB2, MySQL 5 and Mimer SQL. An implementation effort for PostgreSQL / EnterpriseDB is under way.

      If your database does not support SQL/PSM and is not working on implementing it you have only your choice of database to blame for that.
  • Replace it by Lisp (Score:3, Interesting)

    by RAMMS+EIN ( 578166 ) on Thursday November 03, 2005 @07:06PM (#13945908) Homepage Journal
    I'd replace it with a special-purpose Lisp, and compose queries as s-expressions. Mix and match query elements in a flexible manner, yet never run the risk of injections, because it all happens in a structured way. I've done things like this on a small scale (contact information database), and it works really nicely.
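One way the idea could look, sketched in Python rather than Lisp: represent the query as nested tuples (a stand-in for s-expressions) and compile them to parameterized SQL, so user-supplied values are always bound, never spliced into the string. The compiler, schema, and operator set here are entirely invented and minimal:

```python
import sqlite3

# Toy s-expression -> parameterized SQL compiler. Supports only
# and/or and simple comparisons; everything here is illustrative.
def compile_where(expr, params):
    op = expr[0]
    if op in ("and", "or"):
        parts = [compile_where(e, params) for e in expr[1:]]
        return "(" + (" %s " % op.upper()).join(parts) + ")"
    column, value = expr[1], expr[2]
    params.append(value)          # value is bound, never interpolated
    return "%s %s ?" % (column, op)

params = []
sql = "SELECT name FROM users WHERE " + compile_where(
    ("and", ("=", "active", 1), (">", "age", 30)), params)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, active INTEGER, age INTEGER)")
conn.execute("INSERT INTO users VALUES ('ada', 1, 36)")
print(sql)  # SELECT name FROM users WHERE (active = ? AND age > ?)
print(conn.execute(sql, params).fetchall())  # [('ada',)]
```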
  • by klui ( 457783 ) on Thursday November 03, 2005 @07:08PM (#13945930)
    No standard way to extract/load data. No standard way to get all tables in a database. Basically DDL is entirely separate and each type of database has its own way of doing things. Let's not talk about embedded SQL and optimizing queries (like Oracle hints.. *ugh*).
  • Dates (Score:4, Interesting)

    by omibus ( 116064 ) on Thursday November 03, 2005 @07:30PM (#13946122) Homepage Journal
    1. Standard date functions and handling.
    2. Allow for SELECT statement reordering. I should be able to have the FROM first. This would be a BIG help to SQL editors!
    3. Column aliases. So if I have a column in the select that is ColA+ColB as "My Value", I can use the "My Value" in the WHERE, GROUP BY, and ORDER BY instead of having to restate the equation every time.
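On point 3: standard SQL only lets you reuse a column alias in ORDER BY; in a WHERE you must restate the expression or wrap it in a subquery (some engines are more lenient). A sketch of the subquery workaround, with invented names, using SQLite from Python:

```python
import sqlite3

# Compute the aliased expression once in a subquery, then filter on
# the alias outside it. Table "t" and "my_value" are invented names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (cola INTEGER, colb INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 2), (10, 20)])

rows = conn.execute("""
    SELECT my_value FROM
        (SELECT cola + colb AS my_value FROM t)
    WHERE my_value > 5
""").fetchall()
print(rows)  # [(30,)]
```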
  • by yamla ( 136560 ) <chris@@@hypocrite...org> on Thursday November 03, 2005 @07:33PM (#13946141)
    It always bugs me that you write UPDATE and INSERT totally differently. I'd much rather see them essentially the same (update obviously would need a WHERE clause). Not a big deal, I admit, but it makes my life harder.

    Fundamentally, though, what we need is much better interfaces to our applications. Having to convert C++ data into some format appropriate for SQL and back again is a pain in the behind. Every existing interface between the two that I've seen is crap. Heck, most of the C++ interfaces don't even let you use the std::string type. What are we, back in 1995? Forget vectors and maps, or the fancy boost multiindexed templates. Anyway, these really aren't problems for SQL to solve. And yes, there are object-oriented databases and the like. But relational databases are still pretty much universal. I just mention it because it is a pain in the behind.
    • I have to tip my hat to that suggestion. It's never made sense. I don't do any intensive database work but the stuff I do, it's annoying to have to write out the insert/update queries differently in the same function.
  • Infuriating (Score:4, Insightful)

    by TTK Ciar ( 698795 ) on Thursday November 03, 2005 @07:38PM (#13946176) Homepage Journal

    Out of all the annoying issues, I've pulled out the most hair over unique id's, and INSERT vs UPDATE.

    Most SQL implementations give you some way of assigning unique id's to newly INSERT'ed rows. It would be nice if there were a standardized way, but that's a side issue. Once rows have unique id's, you can identify rows to be updated by id. This is very fast and simple.

    Except .. in order to find out what id the DBMS has assigned a row, I usually have to follow my INSERT with a SELECT, to read the id column. Slow and annoying. Sometimes the DBMS I am working with takes a few seconds to perform the INSERT, and ten minutes to perform the SELECT. That takes it beyond an optimization issue, and into a workability issue.

    Also, if I do not yet know if a data record has been INSERT'ed, and I need to either UPDATE the existing record or INSERT a new one (say, with just a new timestamp), then I need to either attempt an UPDATE and then fall back on INSERT if the UPDATE fails (ew!) or attempt a SELECT and either INSERT or UPDATE depending on whether it returned any rows (ew!).

    If SQL came up with a standardized way to associate unique id's with newly INSERT'ed rows, it would be very, very nice if the id column(s) assigned were returned to the client in the same packet as the message confirming that the INSERT succeeded. Nearly zero additional overhead, neat, fast, and easy.

    To solve the UPDATE/INSERT issue, I'm less sure. Say, for instance, that I have a daemon which periodically scans the filesystems in a cluster of machines, and it wants to UPDATE the "exists" column of a given row identified by a ( hostname, mountpoint, path, filename) tuple with the current time, if that row already exists, or INSERT a whole new row for that file if it does not exist. Perhaps there could be a "WRITE" command which is just like INSERT but overwrites a row if it already exists? That seems like the wrong solution, too. In the meantime, I play with caches of hashes to unique id's and lose more hair.

    -- TTK
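For what it's worth, several client APIs already behave the way this post asks: the generated key comes back as part of the INSERT round trip, with no follow-up SELECT on the data. A sketch with Python's sqlite3 (the table is invented); Oracle's RETURNING ... INTO and MySQL's LAST_INSERT_ID() are other spellings of the same idea:

```python
import sqlite3

# The assigned id is available on the cursor right after the INSERT.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file (id INTEGER PRIMARY KEY, path TEXT)")
cur = conn.execute("INSERT INTO file (path) VALUES ('/etc/hosts')")
print(cur.lastrowid)  # 1 -- the generated id, straight from the INSERT
```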

    • Re:Infuriating (Score:3, Informative)

      by Forbman ( 794277 )
      Well, of course it all depends on the database. For those DBs that use autogenerated field types (i.e., SERIAL in PostGres, AUTOINC attribute in many others) getting the generated ID for the record you inserted is...problematic at best. Your point is valid.

      For other DBs that can use triggers and sequence generators (Oracle, PostGres, Interbase/Firebird), it can be a bit easier, as these DBs have ways to query the sequence generator for its current or next value (as long as you're still in the same transacti
      • For those DBs that use autogenerated field types (i.e., SERIAL in PostGres, AUTOINC attribute in many others) getting the generated ID for the record you inserted is...problematic at best.

        For Postgres, this is patently false. [postgresql.org] You can get the value of a sequence (serial) either before or after your INSERT.
          • the link points to a few examples that are very, very incompatible with threading and multiple processes :)

            however, if you do some locking and synchronization, it can be used.... It would look a bit better if it could work in one transaction so no locking would really be needed, like this:

          >begin
          >allocate_id -> x
          >insert (x,'fishy')
          >commit

          if commit won't come, the x value will be released again, if a another process between this processes allocate&insert comes by, it just gets
    • by MobyDisk ( 75490 ) on Thursday November 03, 2005 @08:43PM (#13946643) Homepage
      I have yet to encounter a DBMS that didn't have an efficient, straightforward way to get the ID after an insert, but YMMV. However, the INSERT/UPDATE issue is a fundamental syntactical problem and it really should be fixed. INSERT and UPDATE do almost the same thing, yet have completely different syntax.

      INSERT INTO someTable (fld1,fld2) VALUES ("foo","bar")
      UPDATE someTable SET fld1="foo", fld2="bar"

      It is REALLY annoying when you have to write some code that generates a SQL statement because you must code for two completely different syntaxes. Someone replied about the Oracle MERGE which seems like a nice way to go.

      Fortunately though, there are lots of good frameworks around SQL that make it so that writing SQL is becoming a thing of the past. I would like to see SQL treated like HTTP - nobody writes HTTP. It's a protocol. Let it be. Just use the tools. I guess it will never go away though...

    • Perhaps, instead of replacing SQL, let's just add a clause to the update statement (or insert statement):

      insert into table tabname (field1, field2) values (1, 2) [ WITH UPDATE [ALL,FIRST,ADD,NONE] ON FIELD1=1 ];

      Thus, if a row exists with field1=1, it will update that row with value field2=2.
      If there are multiple rows where field1=1, you define whether to update all of them ('ALL'), update the first one found ('FIRST'), insert one more ('ADD'), or update none of them ('NONE').

      Alternately, y
    • Re:Infuriating (Score:2, Informative)

      Maybe only Oracle does this?:

      insert into tab (col1, col2, col3, ...) values (v1, v2, v3, ...)
      returning expr1, expr2, ...
      into var1, var2, ...

      or

      update tab
      set col1=v1, col2=v2, col3=v3, ...
      returning expr1, expr2, ...
      into var1, var2, ...
    • Except .. in order to find out what id the DBMS has assigned a row, I usually have to follow my INSERT with a SELECT, to read the id column. Slow and annoying.

      What database requires this? Every database I've ever used has an efficient, documented method of getting the automatically generated ID.

      To solve the UPDATE/INSERT issue, I'm less sure.

      MySQL has this:

      INSERT INTO table (...) VALUES (...) ON DUPLICATE KEY UPDATE column=...

      It also lets you use UPDATE syntax for INSERTs:

      INSERT INTO table SET name=...

      In g
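A runnable sketch of the update-or-insert pattern this thread is circling. SQLite spells it INSERT OR REPLACE (note: it deletes the conflicting row and re-inserts, rather than updating in place), MySQL has ON DUPLICATE KEY UPDATE, and the SQL standard later added MERGE. The schema is invented:

```python
import sqlite3

# One statement either inserts the row or replaces the existing one
# keyed on the primary key. Table "seen" is an invented example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seen (path TEXT PRIMARY KEY, stamp INTEGER)")
conn.execute("INSERT OR REPLACE INTO seen VALUES ('/tmp/a', 100)")
conn.execute("INSERT OR REPLACE INTO seen VALUES ('/tmp/a', 200)")  # "update"
rows = conn.execute("SELECT * FROM seen").fetchall()
print(rows)  # [('/tmp/a', 200)]
```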
    • some solutions (Score:2, Informative)

      by Matje ( 183300 )
      fwiw a few solutions.

      in MySQL, the statement REPLACE INTO will perform an update or an insert, depending on whether the primary key value exists in the table. It performs exactly like your WRITE command would.

      in MySQL, you can perform a SELECT LAST_INSERT_ID() to get the last inserted value.
      in MSSQL, use a SELECT @@IDENTITY to get the same. (check in the docs whether you need @@IDENTITY OR @SCOPE_IDENTITY or the third version, I always forget).
  • My complaint dates from when I first started learning SQL (MS flavor, the only way offered at school at the time) and got to wildcards: someone writing the spec chose % instead of *, and so on. To top it off, I use SQL so infrequently (months between writing it, then only a few statements) that this difference bites me in the butt just about every time I try to use a wildcard.
  • I feel lucky to have not been burned by this, but you can bet a lot of people have. If you really want to delete everything, you should have to type something like "delete from table where true" or "truncate table".

    The SELECT syntax is usually good enough for me, but the capabilities of most implementations leave much to be desired. They only support a small subset of what the syntax suggests.
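The failure mode being guarded against is easy to reproduce: drop the WHERE clause and DELETE silently empties the table. A sketch in Python's sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO orders VALUES (?)", [(1,), (2,), (3,)])

# Intended: DELETE FROM orders WHERE id = 2
# Typed: the WHERE clause got lost, and SQL happily deletes everything.
conn.execute("DELETE FROM orders")
print(conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0])  # 0
```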
    • I agree wholeheartedly. My main beef with SQL is that it is optimised for doing the wrong thing in so many cases:

      • It's easier to select all rows from a table than it is to select a specific one.
      • It's easier to select all columns from a table than it is to select the subset you are interested in.
      • It's easier to delete all rows in a table than it is to delete a specific one.
      • It's easier to clobber all rows with an update than to update a specific one.

      One thing I do more often than anything else operating o

  • mind reader (Score:4, Funny)

    by chochos ( 700687 ) on Thursday November 03, 2005 @09:16PM (#13946837) Homepage Journal
    SELECT * FROM whatever the hell my customers have in mind

  • I've never been a fan of the join syntax. 'Inner' and 'outer' and 'left' and 'right' and 'full' joins? Let's for gosh sakes use the simple set-theory math terms we all understand.

    Instead of:

    select A.f1, b.f1, b.f2 from A join B on A.f1 = b.f1

    do:

    select A.f1, b.f1, b.f2 from intersection join of A, B on A.F1 = B.F1
    select A.f1, b.f1, b.f2 from union join of A, B on A.F1 = B.F1
    select A.f1, b.f1, b.f2 from union minus intersection join of A, B on A.F1 = B.F1
    select A.f1, b.f1, b.f2 from union minus A
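For comparison with the proposal above, here is what the standard spellings do: an inner join keeps only matching rows (roughly the "intersection" case), while a left outer join keeps every row of the left table and pads the missing side with NULLs. A sketch via Python's sqlite3, sample data made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (f1 INTEGER);
    CREATE TABLE B (f1 INTEGER, f2 TEXT);
    INSERT INTO A VALUES (1), (2);
    INSERT INTO B VALUES (1, 'one'), (3, 'three');
""")

# INNER JOIN: only rows with a match in both tables survive.
print(conn.execute(
    "SELECT A.f1, B.f2 FROM A JOIN B ON A.f1 = B.f1").fetchall())
# [(1, 'one')]

# LEFT JOIN: every A row survives; unmatched B columns become NULL.
print(conn.execute(
    "SELECT A.f1, B.f2 FROM A LEFT JOIN B ON A.f1 = B.f1 ORDER BY A.f1").fetchall())
# [(1, 'one'), (2, None)]
```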

  • USE Database;
    SELECT * FROM table WHERE column !="99";

    Did I get the job? Or for that matter DID YOU?
  • But I've been getting into MDX lately (data cube technology). I'll tell ya, if you want to do super-fast reporting of hierarchical data, cubes are the way to go. And MDX is a very feature rich cube querying language. It's such a breeze to do a query in MDX that would take all kinds of sub-queries, inner/outer joins and calculated fields to do with SQL. If you're finding yourself wracking your brain with complex SQL for reports you should really check out cubes and MDX. The learning curve will be well worth
  • The biggest thing needed is the ability to return rows of differing column makeup for object oriented purposes. You need to be able to go:

    SELECT FROM base-class

    and get back all the fields from sub-classes so that you can populate entire objects from them. Real object oriented databases let you do that.

  • Most SQL dialects include some sort of exclusion operator, although it excludes rows rather than columns:

    SELECT * FROM A EXCEPT SELECT * FROM B;

    or

    SELECT * FROM A INTERSECT SELECT * FROM B;

    (Oracle spells EXCEPT as MINUS.)

    Other engines do it differently. I think one of the best things about SQL is that it's a loose standard. You can easily choose the engine that works best for you... unless you are from the Cult of Microsoft (SQL Server). DB/2, Oracle, and even Sybase have very cool features that make queries much more powerful.

    While SQL is hard to use at times (remembering d
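The row-level set operators are easy to try out in SQLite, which has shipped EXCEPT and INTERSECT for years. Sample tables hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (x INTEGER);
    CREATE TABLE B (x INTEGER);
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO B VALUES (2);
""")

# EXCEPT: rows of A that do not appear in B (Oracle spells it MINUS).
print(conn.execute(
    "SELECT x FROM A EXCEPT SELECT x FROM B ORDER BY x").fetchall())
# [(1,), (3,)]

# INTERSECT: rows common to both tables.
print(conn.execute(
    "SELECT x FROM A INTERSECT SELECT x FROM B").fetchall())
# [(2,)]
```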
  • We often run into the problem of wanting to copy a row in a table back into the same table (or an identical table in another database) but only make a minor modification to it. For example, let's say I have a system for creating orders. When I want to generate an invoice, I want to copy all the rows from that order into an identical invoice ("you ordered three widgets at $1.99"). The reason to preserve the original order is that you may want to reuse it to generate more invoices later.

    So what I'm looki
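Plain INSERT ... SELECT already covers much of this copy-with-a-tweak pattern: you can copy rows and rewrite individual columns in flight. A sketch with Python's sqlite3 and made-up order/invoice tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_lines (order_id INTEGER, item TEXT, qty INTEGER, price REAL);
    CREATE TABLE invoice_lines (invoice_id INTEGER, item TEXT, qty INTEGER, price REAL);
    INSERT INTO order_lines VALUES (7, 'widget', 3, 1.99);
""")

# Copy the order's rows into invoice_lines, swapping in a new invoice id.
conn.execute("""
    INSERT INTO invoice_lines (invoice_id, item, qty, price)
    SELECT 42, item, qty, price FROM order_lines WHERE order_id = 7
""")
print(conn.execute("SELECT * FROM invoice_lines").fetchall())
# [(42, 'widget', 3, 1.99)]
```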

  • The worst part about SQL is that it was designed at a time when there was this notion that you should be able to read computer languages like English.

    As we all know, unfettered application of this principle leads to some terrible languages (see BASIC, COBOL). SQL is one such language: every extension, every new functionality that looks "sql-like" (eg, not a user function) has to be integrated into the grammar. Furthermore, although perhaps not oft considered by SQL veterans, there are key words that, in a
    • Specific to the poster's problem: It would also be easy to provide a macroing system that would allow you to "invent" functionality originally not thought of, such as negative masking column names. Once you open up macroing, you have the full power of metaprogramming, and nothing is beyond your reach.
  • At the C2 wiki some of us have been thinking about this question for years. Here are some key topics on it, including my pet SQL replacement suggestion:

    http://www.c2.com/cgi/wiki?SqlFlaws [c2.com]

    http://www.c2.com/cgi/wiki?TqlRoadmap [c2.com]
         
  • It seems stupid to me that insert would have a different format from update.

    e.g. (field1,field2,field3) values ('1','2','blah') vs field1='1', field2='2', field3='blah'. Kinda dumb.

    Plus why wasn't there a command to "merge/replace/put" rows into a table much earlier?

    SQL's badly designed, but looks like we'll have to live with it for decades to come.
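Until the two syntaxes converge, a common workaround is to generate both statements from one column-to-value mapping in application code. A hypothetical helper (names invented):

```python
def insert_sql(table, row):
    # Build "INSERT INTO t (a, b) VALUES (?, ?)" from a dict.
    cols = ", ".join(row)
    marks = ", ".join("?" for _ in row)
    return f"INSERT INTO {table} ({cols}) VALUES ({marks})", list(row.values())

def update_sql(table, row, key):
    # Build "UPDATE t SET a = ?, b = ? WHERE key = ?" from the same dict.
    sets = ", ".join(f"{c} = ?" for c in row if c != key)
    params = [v for c, v in row.items() if c != key] + [row[key]]
    return f"UPDATE {table} SET {sets} WHERE {key} = ?", params

row = {"field1": "1", "field2": "2", "field3": "blah"}
print(insert_sql("t", row)[0])
# INSERT INTO t (field1, field2, field3) VALUES (?, ?, ?)
print(update_sql("t", row, "field1")[0])
# UPDATE t SET field2 = ?, field3 = ? WHERE field1 = ?
```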
  • Standardization (Score:3, Insightful)

    by JediTrainer ( 314273 ) on Friday November 04, 2005 @10:52AM (#13949818)
    All I want is for every database to have the same functions; a standardized way to do the basics.

    Every vendor seems to have their own ways to define (or arbitrarily break standard) date functions (add/compare/convert/get current timestamp), string manipulation (like uppercasing, substrings and concatenation), getting the generated id from an inserted row (identity/serial/auto_increment), limiting the number of rows returned (TOP or FIRST?), getting a subset of rows (ie a standard way to get rows 100-150 that works with most DBs) or even getting a list of tables or viewing the schema.

    Trying to make an app portable across DBs is next to impossible, and that's not even counting stored procedures or different behaviours for the same syntax (like NULL handling across the various functions). This is very irritating and should have been fixed long ago. Instead, we get this crap which makes the differences we see across different web browsers look like child's play.
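As one concrete instance of the portability pain: fetching rows 100-150 is LIMIT ... OFFSET in SQLite, MySQL, and PostgreSQL, but TOP in SQL Server, ROWNUM in older Oracle, and FETCH FIRST in the standard. The sqlite3 spelling, with a made-up table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE n (i INTEGER)")
conn.executemany("INSERT INTO n VALUES (?)", [(i,) for i in range(1, 201)])

# Rows 100-150 (1-based): skip the first 99 rows, then take 51.
rows = conn.execute(
    "SELECT i FROM n ORDER BY i LIMIT 51 OFFSET 99").fetchall()
print(rows[0], rows[-1], len(rows))  # (100,) (150,) 51
```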

"What man has done, man can aspire to do." -- Jerry Pournelle, about space flight

Working...