How do I append one string to another in Python?

Posted 2021-02-02 23:09:05

I am looking for an efficient way to append one string to another in Python, other than the following:

var1 = "foo"
var2 = "bar"
var3 = var1 + var2

Is there a good built-in method for this?

1 Answer
  • 面试哥
    面试哥 2021-02-02

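    If you hold only one reference to a string and you concatenate another string onto its end, CPython now special-cases this and tries to extend the string in place.

    The end result is that the operation is amortized O(n).

    For example,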

    s = ""
    for i in range(n):
        s+=str(i)
    

    used to be O(n^2), but is now O(n).
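
    The single-reference condition can be illustrated with a small sketch. The function names below are hypothetical, and the behaviour described in the comments is an assumption about CPython only: when a second name still points at the string, the interpreter cannot resize it in place and has to build a new object on every `+=`. Both functions return the same result; only the running time is expected to differ, and no timings are claimed here.

```python
import timeit


def append_single_ref(n):
    """Repeated += where `s` is the only reference to the growing string."""
    s = ""
    for i in range(n):
        s += str(i)
    return s


def append_with_alias(n):
    """Same loop, but `alias` holds a second reference during each +=."""
    s = ""
    alias = s
    for i in range(n):
        alias = s    # a second reference to the current string
        s += str(i)  # the old object is still needed by `alias`, so a copy is made
    return s


if __name__ == "__main__":
    # Timings depend on the interpreter and machine, so none are shown here.
    for func in (append_single_ref, append_with_alias):
        seconds = timeit.timeit(lambda: func(20000), number=5) / 5
        print(f"{func.__name__}: {seconds * 1e3:.2f} msec per call")
```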

    From the source (bytesobject.c):

    void
    PyBytes_ConcatAndDel(register PyObject **pv, register PyObject *w)
    {
        PyBytes_Concat(pv, w);
        Py_XDECREF(w);
    }
    
    
    /* The following function breaks the notion that strings are immutable:
       it changes the size of a string.  We get away with this only if there
       is only one module referencing the object.  You can also think of it
       as creating a new string object and destroying the old one, only
       more efficiently.  In any case, don't use this if the string may
       already be known to some other part of the code...
       Note that if there's not enough memory to resize the string, the original
       string object at *pv is deallocated, *pv is set to NULL, an "out of
       memory" exception is set, and -1 is returned.  Else (on success) 0 is
       returned, and the value in *pv may or may not be the same as on input.
       As always, an extra byte is allocated for a trailing \0 byte (newsize
       does *not* include that), and a trailing \0 byte is stored.
    */
    
    int
    _PyBytes_Resize(PyObject **pv, Py_ssize_t newsize)
    {
        register PyObject *v;
        register PyBytesObject *sv;
        v = *pv;
        if (!PyBytes_Check(v) || Py_REFCNT(v) != 1 || newsize < 0) {
            *pv = 0;
            Py_DECREF(v);
            PyErr_BadInternalCall();
            return -1;
        }
        /* XXX UNREF/NEWREF interface should be more symmetrical */
        _Py_DEC_REFTOTAL;
        _Py_ForgetReference(v);
        *pv = (PyObject *)
            PyObject_REALLOC((char *)v, PyBytesObject_SIZE + newsize);
        if (*pv == NULL) {
            PyObject_Del(v);
            PyErr_NoMemory();
            return -1;
        }
        _Py_NewReference(*pv);
        sv = (PyBytesObject *) *pv;
        Py_SIZE(sv) = newsize;
        sv->ob_sval[newsize] = '\0';
        sv->ob_shash = -1;          /* invalidate cached hash value */
        return 0;
    }
    

    This is easy to verify empirically.

    $ python -m timeit -s"s=''" "for i in xrange(10):s+='a'"
    1000000 loops, best of 3: 1.85 usec per loop
    $ python -m timeit -s"s=''" "for i in xrange(100):s+='a'"
    10000 loops, best of 3: 16.8 usec per loop
    $ python -m timeit -s"s=''" "for i in xrange(1000):s+='a'"
    10000 loops, best of 3: 158 usec per loop
    $ python -m timeit -s"s=''" "for i in xrange(10000):s+='a'"
    1000 loops, best of 3: 1.71 msec per loop
    $ python -m timeit -s"s=''" "for i in xrange(100000):s+='a'"
    10 loops, best of 3: 14.6 msec per loop
    $ python -m timeit -s"s=''" "for i in xrange(1000000):s+='a'"
    10 loops, best of 3: 173 msec per loop
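
    The commands above use `xrange`, which exists only in Python 2. A rough Python 3 equivalent is sketched below using the standard `timeit` module; `append_loop` and `time_append` are hypothetical helper names, not part of the original answer. Unlike the command-line runs, this version starts from an empty string on every call, so each measurement covers exactly `n` appends. No output is shown because the numbers depend on the machine and interpreter.

```python
import timeit


def append_loop(n):
    """Build a string of n characters by repeated += on a single reference."""
    s = ""
    for _ in range(n):
        s += "a"
    return s


def time_append(n, repeat=3, number=10):
    """Return the best observed time in seconds for one call of append_loop(n)."""
    timer = timeit.Timer(lambda: append_loop(n))
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    # If appending is amortized O(n), each row should take roughly 10x the previous one.
    for n in (10, 100, 1000, 10000, 100000):
        print(f"n={n:>7}: {time_append(n) * 1e6:.2f} usec per call")
```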
    

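    Note, however, that this optimization is not part of the Python specification. As far as I know, it exists only in the CPython implementation. The same empirical test on PyPy or Jython, for example, may show the older O(n**2) performance.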

    $ pypy -m timeit -s"s=''" "for i in xrange(10):s+='a'"
    10000 loops, best of 3: 90.8 usec per loop
    $ pypy -m timeit -s"s=''" "for i in xrange(100):s+='a'"
    1000 loops, best of 3: 896 usec per loop
    $ pypy -m timeit -s"s=''" "for i in xrange(1000):s+='a'"
    100 loops, best of 3: 9.03 msec per loop
    $ pypy -m timeit -s"s=''" "for i in xrange(10000):s+='a'"
    10 loops, best of 3: 89.5 msec per loop
    

    So far so good, but then,

    $ pypy -m timeit -s"s=''" "for i in xrange(100000):s+='a'"
    10 loops, best of 3: 12.8 sec per loop
    

    Ouch, that is even worse than quadratic. So PyPy does well with short strings but performs poorly with larger ones.
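
    Since the in-place append is an implementation detail, code that has to run well on other interpreters may prefer an approach that does not rely on it. The sketch below is not from the original answer; it shows two standard-library alternatives, `str.join` and `io.StringIO`, with hypothetical function names. Both collect the pieces first and build the final string once.

```python
import io


def build_with_join(n):
    """Join all pieces in a single call instead of appending one at a time."""
    return "".join(str(i) for i in range(n))


def build_with_stringio(n):
    """Write pieces into an in-memory text buffer, then read the result once."""
    buf = io.StringIO()
    for i in range(n):
        buf.write(str(i))
    return buf.getvalue()


if __name__ == "__main__":
    # Both helpers produce the same string as the += loop shown earlier.
    print(build_with_join(10))
    print(build_with_stringio(10))
```

    For just two strings, as in the question, `var1 + var2` is already fine; these alternatives are meant for loops that append many pieces.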


