Writing to a numpy array from a dictionary

Posted 2021-01-29 16:41:38

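I have a dictionary of file header values (time, number of frames, year, month, etc.) that I would like to write into a numpy array. The code I currently have is as follows: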

    arr=np.array([(k,)+v for k,v in fileheader.iteritems()],dtype=["a3,a,i4,i4,i4,i4,f8,i4,i4,i4,i4,i4,i4,a10,a26,a33,a235,i4,i4,i4,i4,i4,i4"])

But I get an error: "can only concatenate tuple (not "int") to tuple".

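Basically, the end result needs to be arrays storing the overall file header information (512 bytes) and each frame's data (header and data, 49408 bytes per frame). Is there an easier way to do this?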

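Edit: To clarify (for myself as well), I need to write the data from each frame of the file into an array. I was given MATLAB code as a base. Here is a rough idea of the code given to me: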

    data.frame=zeros([512 96])
    frame=uint8(fread(fid,[data.numbeams,512],'uint8'))
    data.frame=frame

How do I translate the "frame" part into Python?

1 answer
  • 面试哥
    面试哥 2021-01-29

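    You're probably better off keeping the header data in a dict. Do you really need it as an array? (If so, why? Having the header in a numpy array has some advantages, but it is more complex than a simple dict and less flexible.)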

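    One drawback of a dict is that, before Python 3.7, its keys had no predictable order (from 3.7 on, a plain dict preserves insertion order). If you need to write the header back to disk in a fixed order (similar to a C struct), you need to store the order of the fields as well as their values. In that case, you might consider an ordered dict (collections.OrderedDict), or just put together a simple class that holds the header data and stores the order there.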
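
    A minimal sketch of the "store the order alongside the values" idea, using the standard `struct` module. The three fields (`name`, `count`, `scale`), the little-endian layout, and the helper names `pack_header` / `unpack_header` are all hypothetical and do not come from the actual file format:

```python
import struct
from collections import OrderedDict

# Hypothetical header layout: (field name, struct format code), in on-disk order.
HEADER_LAYOUT = [("name", "3s"), ("count", "i"), ("scale", "d")]
# "<" means little-endian with no padding between fields (an assumption here).
HEADER_STRUCT = struct.Struct("<" + "".join(code for _, code in HEADER_LAYOUT))


def pack_header(header):
    """Pack the header values in the order given by HEADER_LAYOUT."""
    return HEADER_STRUCT.pack(*(header[name] for name, _ in HEADER_LAYOUT))


def unpack_header(raw):
    """Unpack bytes into an OrderedDict whose keys follow the on-disk order."""
    values = HEADER_STRUCT.unpack(raw)
    return OrderedDict(zip((name for name, _ in HEADER_LAYOUT), values))


# The dict is deliberately written out of order; the layout list decides the order.
header = {"scale": 3.45, "name": b"abc", "count": 456}
raw = pack_header(header)
restored = unpack_header(raw)
```

    Because the order lives in `HEADER_LAYOUT` rather than in the dict, the packed bytes do not depend on how the dict happens to iterate.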

    Unless there is a good reason to put it into a numpy array, you may not want to.

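    That said, a structured array does preserve the order of the header fields and makes it easier to write a binary representation to disk, but it is inflexible in other ways.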

    If you do want to make the header an array, you could do something like this:

    import numpy as np
    
    # Lists can be modified, but preserve order. That's important in this case.
    names = ['Name1', 'Name2', 'Name3']
    # It's "S3" instead of "a3" for a string field in numpy, by the way
    formats = ['S3', 'i4', 'f8']
    
    # It's often cleaner to specify the dtype this way instead of as a giant string
    dtype = dict(names=names, formats=formats)
    
    # Before Python 3.7, a dict won't preserve the order we're specifying things in!!
    # If we iterate through it there, things may be in any order.
    header = dict(Name1='abc', Name2=456, Name3=3.45)
    
    # Therefore, we'll be sure to pass things in in order...
    # Also, np.array will expect a tuple instead of a list for a structured array...
    values = tuple(header[name] for name in names)
    header_array = np.array(values, dtype=dtype)
    
    # We can access a field in the array like this...
    print(header_array['Name2'])
    
    # And dump it to disk (similar to a C struct) with
    header_array.tofile('test.dat')
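
    As a quick check on the approach above, the record can be read back with `np.fromfile` and the file size compared against `dtype.itemsize`. This is only a sketch: it reuses the placeholder field names `Name1`..`Name3` from the example, and the temporary file path is an arbitrary choice:

```python
import os
import tempfile

import numpy as np

# Same placeholder fields as in the example above (packed, no alignment padding).
dtype = np.dtype([("Name1", "S3"), ("Name2", "i4"), ("Name3", "f8")])
header_array = np.array((b"abc", 456, 3.45), dtype=dtype)

# Write the raw bytes of the record to a temporary file.
path = os.path.join(tempfile.mkdtemp(), "header.dat")
header_array.tofile(path)

# count=1 reads exactly one record; [0] turns the 1-element array into a record.
restored = np.fromfile(path, dtype=dtype, count=1)[0]
file_size = os.path.getsize(path)
```

    If `file_size` and `dtype.itemsize` disagree, the field formats do not match the on-disk layout. The same check can be used to confirm that a real header dtype adds up to the expected 512 bytes.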
    

    On the other hand, if you just want access to the values in the header, keep it as a dict. It's simpler that way.


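    Based on what it sounds like you're doing, I would do something like the following. I'm using numpy arrays to read in the header, but the header values are actually stored as class attributes (alongside the header array).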

    This looks more complicated than it actually is.

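    I'm just defining two new classes, one for the parent file and one for a frame. You could do the same thing with less code, but this gives you a foundation for more complex things.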

    import numpy as np
    
    class SonarFile(object):
        # These define the format of the file header
        header_fields = ('num_frames', 'name1', 'name2', 'name3')
        header_formats = ('i4', 'f4', 'S10', '>u4')
    
        def __init__(self, filename):
            # Open in binary mode; the file holds raw bytes, not text
            self.infile = open(filename, 'rb')
            dtype = dict(names=self.header_fields, formats=self.header_formats)
    
            # Read in the header as a numpy array (count=1 is important here!)
            self.header = np.fromfile(self.infile, dtype=dtype, count=1)
    
            # Store the position so we can "rewind" to the end of the header
            self.header_length = self.infile.tell()
    
            # You may or may not want to do this (If the field names can have
            # spaces, it's a bad idea). It will allow you to access things with
            # sonar_file.name1 instead of sonar_file.header['name1'], though.
            # The [0] unpacks the scalar from the 1-element header array.
            for field in self.header_fields:
                setattr(self, field, self.header[field][0])
    
        # __iter__ is a special function that defines what should happen when we  
        # try to iterate through an instance of this class.
        def __iter__(self):
            """Iterate through each frame in the dataset."""
            # Rewind to the end of the file header
            self.infile.seek(self.header_length)
    
            # Iterate through frames...
            for _ in range(self.num_frames):
                yield Frame(self.infile)
    
        def close(self):
            self.infile.close()
    
    class Frame(object):
        header_fields = ('width', 'height', 'name')
        header_formats = ('i4', 'i4', 'S20')
        data_format = 'f4'
    
        def __init__(self, infile):
            dtype = dict(names=self.header_fields, formats=self.header_formats)
            self.header = np.fromfile(infile, dtype=dtype, count=1)
    
            # See discussion above...
            for field in self.header_fields:
                setattr(self, field, self.header[field][0])
    
            # I'm assuming that the size of the frame is in the frame header...
            ncols, nrows = self.width, self.height
    
            # Read the data in
            self.data = np.fromfile(infile, self.data_format, count=ncols * nrows)
    
            # And reshape it into a 2d array.
            # I'm assuming C-order, instead of Fortran order.
            # If it's fortran order, just do "data.reshape((ncols, nrows)).T"
            self.data = self.data.reshape((nrows, ncols))
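
    For the MATLAB snippet in the question specifically, a rough Python equivalent of `fread(fid, [numbeams, 512], 'uint8')` might look like the sketch below. It assumes the frame data is plain `uint8` and that the file position is already at the start of the frame data; the function name `read_frame` and the tiny synthetic stream are hypothetical:

```python
import io

import numpy as np


def read_frame(fileobj, num_beams, samples=512):
    """Read one uint8 frame, mimicking MATLAB's fread(fid, [num_beams, samples], 'uint8')."""
    nbytes = num_beams * samples
    raw = fileobj.read(nbytes)
    if len(raw) != nbytes:
        raise EOFError("incomplete frame")
    # MATLAB fills matrices column by column, hence order="F" (Fortran order).
    return np.frombuffer(raw, dtype=np.uint8).reshape((num_beams, samples), order="F")


# Tiny synthetic stream standing in for a real file: 2 beams x 3 samples.
stream = io.BytesIO(bytes([1, 2, 3, 4, 5, 6]))
frame = read_frame(stream, num_beams=2, samples=3)
```

    The array returned by `np.frombuffer` is read-only; call `.copy()` on it if the frame needs to be modified in place.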
    

    You would use it something like this:

    dataset = SonarFile('input.dat')
    
    for frame in dataset:
        im = frame.data
        # Do something...
    

